1. Introduction

There is an intimate connection between (nonparametric) maximum likelihood estimators for inverse
problems and integral equations, a connection that does not seem to be well known. In the present paper I will
concentrate on (smoothed) maximum likelihood estimators for interval censored data, but maximum
likelihood estimators for deconvolution will also be discussed. I will show that integral equations play a crucial
role in the development of distribution theory for so-called "smooth functionals" (of which moments are the
prototype), based on the maximum likelihood estimator (MLE), but also in the development of the local
limit theory of the MLE.

In [16] the maximum smoothed likelihood estimator (MSLE) was studied for the current status model,
the simplest interval censoring model. It is called the interval censoring, case 1, model in [11] and [20]. It
was shown in [16] that, under certain regularity conditions, the MSLE, evaluated at a fixed interior point,
converges at rate $n^{-2/5}$ to the real underlying distribution function, if one takes a bandwidth of order
$n^{-1/5}$. This convergence rate is faster than the convergence rate of the non-smoothed maximum likelihood
estimator, which is $n^{-1/3}$ in this situation, as shown in [11] and [20]. Moreover, the limit distribution is
normal, in contrast with the limit distribution of the non-smoothed maximum likelihood estimator.

In the more realistic interval censoring model, there is an interval in which the relevant (unobservable)
event takes place. This situation is in fact much more common, in particular in medical statistics. It is called
the interval censoring, case 2, model in [11] and [20]. In [13] the local distribution theory for the MSLE was
developed for this model and it was shown that, under a condition which is called the "separation condition",
the MSLE converges at rate $n^{-2/5}$ if the bandwidth is of the usual order $n^{-1/5}$ and that the MSLE has a
normal limit distribution, again in contrast with the ordinary MLE. Here a (non-linear) integral equation
again plays a crucial role.

As noted in [13], the MSLE has the advantage over the MLE and the smoothed maximum likelihood 
estimator (SMLE) that it can be used in situations where the MLE or SMLE cannot be used. For example, 
the MLE itself is proved to be inconsistent for the current status continuous mark model (see [24]), and the 
SMLE will inherit the bad properties of the MLE in this situation, and also be inconsistent. On the other 
hand, a version of the MSLE, based on histograms, is proved to be consistent for this model in [17]. A similar 
phenomenon holds for the two-dimensional right-censoring model, where the MLE is inconsistent (see [25]) 
and the SMLE will not make this better. In this case the MSLE will, under appropriate smoothing of the 
observation distribution, also be consistent. 

The MSLE can be viewed as an estimator minimizing a Kullback-Leibler distance and is therefore a 
natural generalization of the MLE, which minimizes the Kullback-Leibler distance of the distributions in 
the allowed class w.r.t. the unsmoothed empirical observation distribution (of course the Kullback-Leibler 
distance is not a real "distance", but we follow the common convention of calling it a distance here). The
difference between the MLE and the MSLE is that, in computing the MSLE, one starts by smoothing the 
empirical observation distribution of the data, and next looks for a distribution in the allowed class, closest 
to this smoothed observation distribution in Kullback-Leibler distance. In this way one can prevent the 
inconsistency properties of the MLE, as observed in [24] and [25], which have as common cause that the 
MLE tries to distribute mass on lower dimensional surfaces without using the surrounding information of 
the (higher dimensional) data. 

We note here in passing the peculiar fact that in this Kullback-Leibler minimization, the part involving 
the minimization is "on the wrong side" of the argument. In large deviation theory (in particular the large 
deviation theory associated with efficiency computations for test statistics), one usually has to deal with 
minimizing 

$$\mathcal{K}(Q,P)$$

over $Q$ for fixed $P$, where $\mathcal{K}(Q,P)$ is the Kullback-Leibler distance between the probability measures $Q$ and
$P$, see, e.g., [10] and [19], but in the minimization needed to compute the M(S)LE, one has to minimize

$$\mathcal{K}(Q,P)$$

over P for fixed Q, and this minimization problem is essentially different, and less theory is available. This 
has to do with the asymmetry of the Kullback-Leibler distance, which, for example, is not present with the 
Hellinger distance, which is a real distance. 
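The asymmetry can be seen in a minimal numerical sketch. The two-point distributions `q` and `p` below are arbitrary illustrative choices, not taken from the paper, and the Hellinger distance is normalised as $h^2 = \frac12\sum(\sqrt{q_i}-\sqrt{p_i})^2$, which is one common convention.

```python
import math

def kl(q, p):
    """Kullback-Leibler divergence K(Q, P) = sum_i q_i log(q_i / p_i)."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def hellinger(q, p):
    """Hellinger distance with h^2 = (1/2) sum_i (sqrt(q_i) - sqrt(p_i))^2."""
    return math.sqrt(0.5 * sum((math.sqrt(qi) - math.sqrt(pi)) ** 2
                               for qi, pi in zip(q, p)))

q = [0.5, 0.5]   # illustrative distribution Q
p = [0.9, 0.1]   # illustrative distribution P

print(kl(q, p), kl(p, q))                # the two orderings give different values
print(hellinger(q, p), hellinger(p, q))  # the Hellinger distance is symmetric
```

Minimizing over the second argument, as the M(S)LE does, is therefore a different problem from minimizing over the first.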

We start in section 2 by studying smooth functionals for interval censoring, where we discuss the theory, 
developed in [7], [8] and [9]. The notation, introduced here, will be used in the remainder of the paper. 
Section 3 discusses a local limit result for interval censoring, the separated case, which was proved in [12] 
(see Theorem 3.3), and discusses a conjecture for the non-separated case (Theorem 3.1). Section 4 discusses 
a limit result recently proved for the MSLE for interval censoring, showing that, under appropriate regularity
conditions, the rate of the MLE can be improved to $n^{-2/5}$ by using the MSLE instead of the MLE. Also,
the limit distribution is normal here, in contrast with the limit behavior of the MLE.

Section 5 takes a more heuristic turn, in the hope that researchers will pick up on this interesting topic, 
where there are still many open problems. It is based on "Nachdiplom" lectures I gave at the ETH Zürich, in
the fall of 2007, on the invitation of Sara van de Geer, and it is the first time these lectures appear (partly) 
in print. During these lectures, I tried to develop the theory of the integral equations, associated with 
deconvolution. The big hurdle here is the fact that the relevant efficient influence functions are unbounded 
near the edge of the domain on which they are defined. These functions can also only be numerically 
determined by solving the associated integral equations and do not have explicit representations, except in 
the case of uniform and exponential deconvolution. Nevertheless, pursuing this approach seems worthwhile, 
since there is little doubt in my mind that the MLE will automatically give efficient estimates of smooth 
functionals here, just as in the case of interval censoring, in contrast with the usual estimates, based on 
Fourier methods. Section 6 discusses results and conjectures for the local limit behavior of the MLE for 
deconvolution. Some of the conjectures go back more than 20 years, but have been proved for special cases
in the meantime.



2. Smooth functionals in the interval censoring model 

We recall the interval censoring, case 2, model. Let $X_1, \dots, X_n$ be a sample of unobservable random
variables from an unknown distribution function $F_0$ on $[0,\infty)$. Suppose that one can observe $n$ pairs $(T_i,U_i)$,
independent of $X_i$, where $U_i > T_i$. Moreover,

$$\Delta_{i1} \stackrel{\text{def}}{=} 1_{\{X_i \le T_i\}}, \qquad \Delta_{i2} \stackrel{\text{def}}{=} 1_{\{T_i < X_i \le U_i\}} \qquad\text{and}\qquad \Delta_{i3} \stackrel{\text{def}}{=} 1 - \Delta_{i1} - \Delta_{i2}, \tag{2.1}$$

provide the only information one has on the position of the random variables $X_i$ with respect to the
observation times $T_i$ and $U_i$. In this set-up one wants to estimate the unknown distribution function $F_0$, generating
the "unobservables" $X_i$.
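To make the observation scheme (2.1) concrete, the following sketch generates case 2 data. It is only an illustration under assumptions that are not part of the model definition: $F_0$ is taken to be uniform on $[0,1]$ and $(T_i,U_i)$ uniform on the triangle $\{u - t > \epsilon\}$ (the setting used in the example later in this section); the function name `simulate_case2` is hypothetical.

```python
import numpy as np

def simulate_case2(n, eps=0.1, seed=0):
    """Simulate (T_i, U_i, Delta_i1, Delta_i2, Delta_i3) for interval censoring, case 2."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=n)   # hidden X_i, never returned (F_0 uniform: an assumption)
    t = np.empty(n)
    u = np.empty(n)
    filled = 0
    # (min, max) of two independent uniforms is uniform on {t < u}; keeping only
    # pairs with u - t > eps gives the uniform distribution on the separated triangle.
    while filled < n:
        cand = rng.uniform(0.0, 1.0, size=(2 * (n - filled) + 10, 2))
        lo, hi = cand.min(axis=1), cand.max(axis=1)
        keep = hi - lo > eps
        k = min(int(keep.sum()), n - filled)
        t[filled:filled + k] = lo[keep][:k]
        u[filled:filled + k] = hi[keep][:k]
        filled += k
    d1 = (x <= t).astype(int)               # Delta_i1 = 1{X_i <= T_i}
    d2 = ((t < x) & (x <= u)).astype(int)   # Delta_i2 = 1{T_i < X_i <= U_i}
    d3 = 1 - d1 - d2                        # Delta_i3 = 1{X_i > U_i}
    return t, u, d1, d2, d3

t, u, d1, d2, d3 = simulate_case2(1000)
print(d1.mean(), d2.mean(), d3.mean())
```

Only the observation times and the indicators are returned, which mirrors the fact that the $X_i$ themselves are never observed.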

If $F_0$ is an absolutely continuous distribution function, the MLE converges locally in distribution only at
rate $n^{-1/3}$. But if one wants to estimate a so-called "smooth functional", one can use the MLE to construct
an asymptotically efficient estimate which converges at rate $n^{-1/2}$. Assuming that the density
$f_0$, corresponding to the distribution function $F_0$, has support contained in an interval $[0,M] \subset [0,\infty)$, the
smooth functionals of interest are of the form:

(D1) $K(F) = K(F_0) + \int \kappa_{F_0}(x)\,d(F-F_0)(x) + O(\|F-F_0\|_2^2)$,

for distribution functions $F$ with support contained in $[0,M]$, where $\|F-F_0\|_2$ is the $L_2$-distance between the
distribution function $F$ and $F_0$ w.r.t. Lebesgue measure on $[0,M]$. The function $\kappa_F$ is called the "canonical
gradient" of the functional $K$ w.r.t. $F$ (also called "efficient influence function"), and is supposed to belong
to the space $L_2^0(F)$ of square integrable functions $a$ w.r.t. the measure $dF$, satisfying $\int a\,dF = 0$. The
derivative of the function $x \mapsto \kappa_F(x)$ will be denoted by $k_F(x)$.

Condition (D1) holds for a wider class of functionals than just the class of linear functionals. For linear
functionals

$$F \mapsto \int_0^M c(x)\,dF(x),$$

we have $\kappa_F(x) = c(x) - \int c(x)\,dF(x)$, and (D1) even holds without the $O$-term. However, the functional

$$K(F) = \int_0^M F(x)^2\,w(x)\,dx,$$

where $w$ is a bounded weight function, has canonical gradient

$$\kappa_F(x) = 2\int_{s=x}^{M} F(s)w(s)\,ds - \int_{x=0}^{M} 2\int_{s=x}^{M} F(s)w(s)\,ds\,dF(x)$$



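and also satisfies (D1). Indeed, we have

$$
\begin{aligned}
K(F)-K(F_0) &= \int_0^M F(x)^2 w(x)\,dx - \int_0^M F_0(x)^2 w(x)\,dx \\
&= \int_0^M \{F(x)-F_0(x)\}^2 w(x)\,dx + 2\int_0^M F(x)F_0(x)w(x)\,dx - 2\int_0^M F_0(x)^2 w(x)\,dx \\
&= 2\int_0^M \{F(x)-F_0(x)\}F_0(x)w(x)\,dx + O\left(\|F-F_0\|_2^2\right) \\
&= 2\int_0^M \int_{s=x}^{M} F_0(s)w(s)\,ds\,d(F-F_0)(x) + O\left(\|F-F_0\|_2^2\right) \\
&= \int_0^M \kappa_{F_0}(x)\,d(F-F_0)(x) + O\left(\|F-F_0\|_2^2\right).
\end{aligned}
$$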

The last equality holds since each constant integrates to zero w.r.t. $d(F-F_0)$ on the interval $[0,M]$; this
constant has to be subtracted in the representation of $\kappa_{F_0}$ to let the gradient belong to $L_2^0(F_0)$, an important
property that often seems to be overlooked in this kind of computation. In the next to last line of the display
above we simply use integration by parts.
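The expansion can be checked numerically in a simple case. The sketch below is a sanity check under illustrative assumptions that are not in the text: $M = 1$, $w \equiv 1$, $F_0$ uniform on $[0,1]$ and $F(x) = x^2$. For the quadratic functional the remainder is exactly $\int (F-F_0)^2 w\,dx$, so the two printed numbers should agree up to quadrature error.

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule, written out to avoid relying on a specific NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x = np.linspace(0.0, 1.0, 20001)
w = np.ones_like(x)              # weight function w (assumed constant)
F0, f0 = x, np.ones_like(x)      # F_0 uniform on [0, 1] and its density
F, f = x**2, 2.0 * x             # alternative F(x) = x^2 and its density

# inner(x) = 2 * int_x^1 F_0(s) w(s) ds via a reversed cumulative trapezoid sum
seg = 0.5 * (F0[1:] * w[1:] + F0[:-1] * w[:-1]) * np.diff(x)
inner = 2.0 * np.concatenate([np.cumsum(seg[::-1])[::-1], [0.0]])
# canonical gradient at F_0: subtract the constant so that int kappa dF_0 = 0
kappa = inner - integrate(inner * f0, x)

lhs = integrate(F**2 * w, x) - integrate(F0**2 * w, x)   # K(F) - K(F_0)
linear = integrate(kappa * (f - f0), x)                  # int kappa_{F_0} d(F - F_0)
remainder = integrate((F - F0)**2 * w, x)                # the O(||F - F_0||_2^2) term
print(lhs, linear + remainder)
```

Subtracting the constant does not change the linear term, since constants integrate to zero w.r.t. $d(F-F_0)$, but it is what makes `kappa` integrate to zero w.r.t. $dF_0$.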

Condition (D1) suggests a recipe for proving efficiency and asymptotic normality of $K(\hat F_n)$, if $\hat F_n$ is the
(ordinary unsmoothed) MLE. We first try to establish that

$$\|\hat F_n - F_0\|_2^2 = o_p\left(n^{-1/2}\right), \tag{2.2}$$

and next try to prove, using the characterizing properties of the MLE,

$$n^{1/2}\int \kappa_{F_0}(x)\,d(\hat F_n - F_0)(x) = n^{1/2}\int \theta_{F_0}(t,u,\delta_1,\delta_2)\,d(Q_n - Q_0)(t,u,\delta_1,\delta_2) + o_p(1), \tag{2.3}$$

where $\theta_{F_0}$ is given by:

$$\theta_{F_0}(t,u,\delta_1,\delta_2) = E\left\{\kappa_{F_0}(X) \mid (T_1,U_1,\Delta_{11},\Delta_{12}) = (t,u,\delta_1,\delta_2)\right\}. \tag{2.4}$$
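It is at this point that the integral equations come into play, because $\theta_{F_0}$ also has the representation

$$\theta_{F_0}(t,u,\delta_1,\delta_2) = -\delta_1\frac{\phi_{F_0}(t)}{F_0(t)} - \delta_2\frac{\phi_{F_0}(u)-\phi_{F_0}(t)}{F_0(u)-F_0(t)} + (1-\delta_1-\delta_2)\frac{\phi_{F_0}(u)}{1-F_0(u)}, \tag{2.5}$$

where the function $\phi_{F_0}$ is a solution of the integral equation (equation (6) on p. 77 of [7] and (5) on p. 204
of [8]):

$$\phi_{F_0}(x) = d_{F_0}(x)\left\{k_{F_0}(x) - \int_{u=0}^{x}\frac{\phi_{F_0}(x)-\phi_{F_0}(u)}{F_0(x)-F_0(u)}\,g(u,x)\,du + \int_{v=x}^{M}\frac{\phi_{F_0}(v)-\phi_{F_0}(x)}{F_0(v)-F_0(x)}\,g(x,v)\,dv\right\}, \tag{2.6}$$

and where $d_{F_0}(x)$ is given by

$$d_{F_0}(x) = \frac{F_0(x)\{1-F_0(x)\}}{g_1(x)\{1-F_0(x)\} + g_2(x)F_0(x)}.$$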

Here the indicators $\delta_k$, $k = 1,2$, correspond to the indicators $\Delta_{ik}$ in (2.1); $Q_n$ is the empirical measure of
the observations $(T_i, U_i, \Delta_{i1}, \Delta_{i2})$, $i = 1, \dots, n$, and the corresponding underlying probability measure is $Q_0$.
The observation times $(T_i, U_i)$ have density $g$, with first marginal $g_1$ and second marginal $g_2$. How one gets
from the canonical gradient $\kappa_{F_0}$ in the hidden space to the canonical gradient $\theta_{F_0}$ in the observation space is
further explained in [7], [8] and [12], where also the connection with theory, developed in [27], is explained.

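If one succeeds in proving relation (2.3), one has in one stroke established both asymptotic normality and
asymptotic efficiency of $K(\hat F_n)$. In fact, one then gets, also using (2.2),

$$\sqrt{n}\left\{K(\hat F_n) - K(F_0)\right\} \xrightarrow{\mathcal{D}} N\left(0,\sigma^2_{Q_0}\right), \tag{2.7}$$

where $N(0,\sigma^2_{Q_0})$ is a univariate normal distribution with expectation zero and variance

$$\sigma^2_{Q_0} = \int \theta_{F_0}(t,u,\delta_1,\delta_2)^2\,dQ_0(t,u,\delta_1,\delta_2),$$

which in terms of $\phi_{F_0}$ becomes:

$$\sigma^2_{Q_0} = \int\left\{\frac{\phi_{F_0}(t)^2}{F_0(t)} + \frac{\{\phi_{F_0}(u)-\phi_{F_0}(t)\}^2}{F_0(u)-F_0(t)} + \frac{\phi_{F_0}(u)^2}{1-F_0(u)}\right\} g(t,u)\,dt\,du. \tag{2.8}$$

The asymptotic variance $\sigma^2_{Q_0}$ is the smallest asymptotic variance any regular estimator can attain.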
The limit result (2.7) is proved in [8] under the strict separation hypothesis

$$P\{U_i - T_i \ge \epsilon\} = 1,$$

for some $\epsilon > 0$ and some additional regularity conditions, and for the case that the joint density of $(T_i, U_i)$ is
positive on the diagonal (in which case there can be arbitrarily small observation intervals $(T_i, U_i)$) in [9]. We
will only give some background to the relations (2.2) and (2.3) for the separated case here. To discuss this
in a simple setting, satisfying the conditions for the validity of (2.7), we take as $F_0$ the uniform distribution
function on $[0,1]$ and as $g$ the uniform density on the upper triangle of the unit square with vertices $(0,\epsilon)$,
$(0,1)$ and $(1-\epsilon,1)$, where $\epsilon \in (0,1)$. Moreover, we take

$$\kappa_F(x) = x - \int u\,dF(u),$$

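so the smooth functional we want to estimate is just the first moment of the distribution of $X$. This means
that the integral equation (2.6) boils down to:

$$\phi_{F_0}(x) = d_{F_0}(x)\left\{1 - \int_{u=0}^{x}\frac{\phi_{F_0}(x)-\phi_{F_0}(u)}{F_0(x)-F_0(u)}\,g(u,x)\,du + \int_{v=x}^{1}\frac{\phi_{F_0}(v)-\phi_{F_0}(x)}{F_0(v)-F_0(x)}\,g(x,v)\,dv\right\}, \tag{2.9}$$

where

$$g(t,u) = \frac{2\cdot 1_{\{u-t>\epsilon\}}}{(1-\epsilon)^2}, \qquad g_1(t) = \frac{2(1-t-\epsilon)}{(1-\epsilon)^2}\,1_{[0,1-\epsilon]}(t), \qquad g_2(u) = \frac{2(u-\epsilon)}{(1-\epsilon)^2}\,1_{[\epsilon,1]}(u). \tag{2.10}$$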



Even in this simple setting, the integral equation (2.6) does not have a simple solution and we have to develop
general theory to show that a solution exists.

The integral equation (2.9) (and more generally, (2.6)) is a so-called Fredholm integral equation of the
second kind, see, e.g., [23]. A picture of the solution $\phi_{F_0}$ of (2.9) is shown in Figure 1, where we took $\epsilon = 0.1$.
By (2.5) and (2.7), the asymptotic variance of

$$n^{1/2}\int x\,d(\hat F_n - F_0)(x) \tag{2.11}$$

is therefore given by (2.8), with $\phi_{F_0}$ solving (2.9). Numerical solution of the integral equation (2.9) for $\epsilon = 0.1$
yielded

$$\int\left\{\frac{\phi_{F_0}(t)^2}{F_0(t)} + \frac{\{\phi_{F_0}(u)-\phi_{F_0}(t)\}^2}{F_0(u)-F_0(t)} + \frac{\phi_{F_0}(u)^2}{1-F_0(u)}\right\} g(t,u)\,dt\,du \approx 0.11427, \tag{2.12}$$



and a simulation study, using 10,000 samples of size $n = 1000$, yielded a variance 0.11470 of the values of
(2.11), so for sample size $n = 1000$ the actual variance is close to the asymptotic variance, given by (2.8).
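A numerical solution of this kind can be sketched as follows. This is not the procedure used for the value reported above (which is not described here) but a simple collocation scheme under stated assumptions: a midpoint grid on $[0,1]$, the uniform $F_0$ and triangular $g$ of the example, and hypothetical function names `solve_phi` and `asymptotic_variance`. On the grid the equation becomes the linear system $\phi_i + d_i\sum_j W_{ij}(\phi_i - \phi_j) = d_i$, with weights $W_{ij} = g\,h/|x_i - x_j|$ for $|x_i - x_j| > \epsilon$ and zero otherwise.

```python
import numpy as np

def solve_phi(eps=0.1, n=400):
    """Discretize the integral equation for phi on a midpoint grid and solve it."""
    h = 1.0 / n
    x = (np.arange(n) + 0.5) * h                  # midpoints, avoids x = 0 and x = 1
    c = 2.0 / (1.0 - eps) ** 2                    # height of the uniform density g
    g1 = c * np.clip(1.0 - eps - x, 0.0, None)    # first marginal of g
    g2 = c * np.clip(x - eps, 0.0, None)          # second marginal of g
    d = x * (1.0 - x) / (g1 * (1.0 - x) + g2 * x)
    dist = np.abs(x[:, None] - x[None, :])
    sep = dist > eps                              # g vanishes when |x_i - x_j| <= eps
    W = np.where(sep, c * h / np.where(sep, dist, 1.0), 0.0)
    A = np.eye(n) + d[:, None] * (np.diag(W.sum(axis=1)) - W)
    phi = np.linalg.solve(A, d)
    return x, d, phi

def asymptotic_variance(eps=0.1, n=400):
    """Riemann-sum approximation of the variance integral in terms of phi."""
    x, d, phi = solve_phi(eps, n)
    h = 1.0 / n
    c = 2.0 / (1.0 - eps) ** 2
    t, u = x[:, None], x[None, :]
    sep = (u - t) > eps
    gap = np.where(sep, u - t, 1.0)               # dummy value outside the support of g
    integrand = (phi[:, None] ** 2 / t
                 + (phi[None, :] - phi[:, None]) ** 2 / gap
                 + phi[None, :] ** 2 / (1.0 - u))
    return float(np.sum(np.where(sep, integrand, 0.0)) * c * h * h)

print(asymptotic_variance())
```

Refining the grid (larger `n`) gives an indication of the discretization error; the printed value can be compared with the value reported for $\epsilon = 0.1$.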




Fig 1. The function $\phi_{F_0}$, solving the integral equation (2.9), for $\epsilon = 0.1$.



Because of the separation property $g(t,u) = 0$, $u - t < \epsilon$, and the hypothesis that $F_0$ has a continuous
strictly positive density $f_0$ on its support (which, in the case of (2.9), is $[0,1]$), the integrating factors in the
integrals on the right-hand side of (2.6) and (2.9) are bounded. So we have a Fredholm integral equation of
the second kind with a bounded integration kernel.

A further key to the treatment of the integral equation (2.9) is the following important observation.
Assuming existence and uniqueness of the bounded continuous solution $\phi_{F_0}$ of (2.9) (which is proved in [7]),
we have:

$$\min_{x\in[0,1]} d_{F_0}(x) \le m \stackrel{\text{def}}{=} \min_{x\in[0,1]} \phi_{F_0}(x) \le M \stackrel{\text{def}}{=} \max_{x\in[0,1]} \phi_{F_0}(x) \le \max_{x\in[0,1]} d_{F_0}(x). \tag{2.13}$$

The upper bound for the maximum $M$ follows in the following way. Let $x_0 \in [0,1]$ be a point where the
bounded continuous solution $\phi_{F_0}$ attains its maximum $M$. Then

$$
\begin{aligned}
M = \phi_{F_0}(x_0) &= d_{F_0}(x_0)\left[1 - \int_{u=0}^{x_0}\frac{\phi_{F_0}(x_0)-\phi_{F_0}(u)}{x_0-u}\,g(u,x_0)\,du - \int_{v=x_0}^{1}\frac{\phi_{F_0}(x_0)-\phi_{F_0}(v)}{v-x_0}\,g(x_0,v)\,dv\right] \\
&\le d_{F_0}(x_0) \le \max_{x\in[0,1]} d_{F_0}(x).
\end{aligned}
$$

The bound for $m$ is derived in a similar way. We similarly can obtain bounds for the derivative of $\phi_{F_0}$ (using
the bound we got for $\phi_{F_0}$ itself).

Having established some properties of the solution $\phi_{F_0}$ of the integral equation (2.9), we can explain why
we can expect (2.2) and (2.3) to hold. First of all, we get

$$\|\hat F_n - F_0\|_2 = O_p\left(n^{-1/3}\right). \tag{2.14}$$

This can be proved by using some (by now) standard empirical process theory, as developed, for example,
in [3] and [1]. Defining

$$q_F(t,u,\delta_1,\delta_2) = \delta_1 F(t) + \delta_2\{F(u)-F(t)\} + (1-\delta_1-\delta_2)\{1-F(u)\},$$

it is first proved, using [3] (or [1]), that the Hellinger distance $h(q_{\hat F_n}, q_{F_0})$, defined by

1/2 



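satisfies

$$h\left(q_{\hat F_n}, q_{F_0}\right) = O_p\left(n^{-1/3}\right) \tag{2.15}$$

(part (i) of Corollary 2 in [8]), and next, using (2.15) and the inequalities

$$(\hat F_n - F_0)^2 \le 4\left(\sqrt{\hat F_n} - \sqrt{F_0}\right)^2 \qquad\text{and}\qquad (\hat F_n - F_0)^2 \le 4\left(\sqrt{1-\hat F_n} - \sqrt{1-F_0}\right)^2,$$

that

$$\|\hat F_n - F_0\|_2^2 = O_p\left(n^{-2/3}\right)$$

(part (ii) of Corollary 2 in [8]), which gives (2.14).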

Next we prove (2.3), using the following crucial lemma. 

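Lemma 2.1. Let, in analogy with (2.5), $\theta_{\hat F_n}$ be defined by

$$\theta_{\hat F_n}(t,u,\delta_1,\delta_2) = -\delta_1\frac{\phi_{\hat F_n}(t)}{\hat F_n(t)} - \delta_2\frac{\phi_{\hat F_n}(u)-\phi_{\hat F_n}(t)}{\hat F_n(u)-\hat F_n(t)} + (1-\delta_1-\delta_2)\frac{\phi_{\hat F_n}(u)}{1-\hat F_n(u)}, \tag{2.16}$$

where the function $\phi_{\hat F_n}$ solves the integral equation

$$\phi_{\hat F_n}(x) = d_{\hat F_n}(x)\left\{1 - \int_{u=0}^{x}\frac{\phi_{\hat F_n}(x)-\phi_{\hat F_n}(u)}{\hat F_n(x)-\hat F_n(u)}\,g(u,x)\,du + \int_{v=x}^{1}\frac{\phi_{\hat F_n}(v)-\phi_{\hat F_n}(x)}{\hat F_n(v)-\hat F_n(x)}\,g(x,v)\,dv\right\}, \tag{2.17}$$

and where $d_{\hat F_n}(x)$ is given by

$$d_{\hat F_n}(x) = \frac{\hat F_n(x)\{1-\hat F_n(x)\}}{g_1(x)\{1-\hat F_n(x)\} + g_2(x)\hat F_n(x)}.$$

Then

$$\int x\,d(\hat F_n - F_0)(x) = -\int \theta_{\hat F_n}(t,u,\delta_1,\delta_2)\,dQ_0(t,u,\delta_1,\delta_2). \tag{2.18}$$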
Lemma 2.1 is a special case of Lemma 1 on p. 214 of [8] and is the first step in proving (2.3). It gives
a representation of the statistic of interest in the hidden space (the expression on the left side of (2.18)) in
terms of a statistic in the observation space. The more general lemma in [8] holds for sufficiently well-behaved
distribution functions $F$ instead of just $\hat F_n$, and the fact that $\hat F_n$ is the MLE is not used in the proof. We
need, however, that (2.17) is a well-defined integral equation (at least for sufficiently large $n$, with probability
tending to one), and this is not immediately clear. For example, the denominators of the integrands could
be zero or arbitrarily close to zero. Moreover, $\hat F_n$ has jumps, so we have to deal with a mix of (absolutely)
continuous functions and functions with jumps in this equation.

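We now sketch the approach taken in [8]. Let $J_i = [\tau_i,\tau_{i+1})$ be the intervals of constancy
of $\hat F_n$, where $\tau_0 = 0$ and $\tau_{m+1} = 1$. We define a piecewise constant version $\bar\phi_{\hat F_n}$ of $\phi_{\hat F_n}$ in the following way:

$$
\bar\phi_{\hat F_n}(x) =
\begin{cases}
\phi_{\hat F_n}(s), & \text{if } \exists s \in J_i \text{ such that } \hat F_n(s) = F_0(s), \\
\phi_{\hat F_n}(\tau_{i+1}-), & \text{if } F_0(x) < \hat F_n(\tau_i), \text{ for } x \in J_i, \\
\phi_{\hat F_n}(\tau_i), & \text{if } F_0(x) > \hat F_n(\tau_i), \text{ for } x \in J_i.
\end{cases}
\tag{2.19}
$$

We next define

$$\bar\theta_{\hat F_n}(t,u,\delta_1,\delta_2) = -\delta_1\frac{\bar\phi_{\hat F_n}(t)}{\hat F_n(t)} - \delta_2\frac{\bar\phi_{\hat F_n}(u)-\bar\phi_{\hat F_n}(t)}{\hat F_n(u)-\hat F_n(t)} + (1-\delta_1-\delta_2)\frac{\bar\phi_{\hat F_n}(u)}{1-\hat F_n(u)}. \tag{2.20}$$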



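Since $\bar\phi_{\hat F_n}$ is absolutely continuous w.r.t. $\hat F_n$ we get:

$$\int \bar\theta_{\hat F_n}(t,u,\delta_1,\delta_2)\,dQ_n(t,u,\delta_1,\delta_2) = 0.$$

Hence, using Lemma 2.1, we get from (2.18),

$$
\begin{aligned}
\int x\,d(\hat F_n - F_0)(x) &= -\int \theta_{\hat F_n}(t,u,\delta_1,\delta_2)\,dQ_0(t,u,\delta_1,\delta_2) \\
&= \int \bar\theta_{\hat F_n}(t,u,\delta_1,\delta_2)\,d(Q_n - Q_0)(t,u,\delta_1,\delta_2) + \int\left\{\bar\theta_{\hat F_n}(t,u,\delta_1,\delta_2) - \theta_{\hat F_n}(t,u,\delta_1,\delta_2)\right\} dQ_0(t,u,\delta_1,\delta_2).
\end{aligned}
$$

Standard empirical process theory yields:

$$\sqrt{n}\int \bar\theta_{\hat F_n}(t,u,\delta_1,\delta_2)\,d(Q_n - Q_0)(t,u,\delta_1,\delta_2) \xrightarrow{\mathcal{D}} N\left(0,\sigma^2_{Q_0}\right),$$

where $\sigma^2_{Q_0}$ is defined by (2.12). Here we use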



^F„ rl'o 



= O 



which is proved in [8]. In fact, defining 



Fn{l-F„) 



Fn{l~F^) 

it can be proved (see (31), p. 16 of [8] and Lemma 4 of [7]) that, for all $x \in [0,1]$,



Cr, i^) ~ ^F„ i^) 



< Cl 



F,,{x)~Fq{x) 



and 



4>F {^)- 4>Fo{x) 



< C2 



Fn{x)~Fo{x) 



for positive constants $c_1$ and $c_2$. These properties of the function $\phi_{\hat F_n}$ are derived from the integral equation
(2.17).
Note that $\phi_{\hat F_n}$ has an absolutely continuous and a discrete part. Lemma 4 of [7] tells us that if $x$ and $y$
both belong to an interval $J_i$ between jumps, we have:

$$\left|\phi_{\hat F_n}(x) - \phi_{\hat F_n}(y)\right| \le K_1|x-y|,$$

for a positive constant $K_1$, independent of $J_i$, and that if $\hat F_n$ has a jump at $x$, we get:

$$\left|\phi_{\hat F_n}(x) - \phi_{\hat F_n}(x-)\right| \le K_2\left|\hat F_n(x) - \hat F_n(x-)\right|$$

for a positive constant $K_2$. So the discrete part of $\phi_{\hat F_n}$ is absolutely continuous w.r.t. $\hat F_n$.
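Finally, by Lemma 2, p. 215, of [8] we get:

$$\int\left\{\bar\theta_{\hat F_n}(t,u,\delta_1,\delta_2) - \theta_{\hat F_n}(t,u,\delta_1,\delta_2)\right\} dQ_0(t,u,\delta_1,\delta_2) = O_p\left(n^{-2/3}\right).$$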



Going through steps of this type seems unavoidable if one wants to prove a result of type (2.7). 

For the non-separated case a result of type (2.7) was proved in [9], see Theorem 3.2 on p. 647 of [9]. The 
result is also discussed in [12]. In this case one can no longer use the integral equation (2.17) directly because 
of the singularities of the integrand, but instead has to use a modified form of this integral equation in a 
transformed scale. The details are omitted here. A full discussion can be found in [12]. 



3. Local limit theory for the MLE in the interval censoring model 

The distinction between the separated case ($P\{U_i - T_i < \epsilon\} = 0$ for some $\epsilon > 0$) and the non-separated case,
where we can have arbitrarily small observation intervals $(T_i, U_i)$, plays an even more prominent role in the
local limit theory than in the theory for the smooth functionals.

For the non-separated case the following conjecture was launched in [11] (and repeated in [20]). 

Theorem 3.1 (Conjecture in [11]). Let $F_0$ and $H$ be continuously differentiable at $t_0$ and $(t_0,t_0)$, respectively,
with strictly positive derivatives $f_0(t_0)$ and $h(t_0,t_0)$, where $H$ is the distribution function of $(T_i,U_i)$. By
continuous differentiability of $H$ at $(t_0,t_0)$ is meant that the density $h(t,u)$ is continuous at $(t,u)$, if $t < u$
and $(t,u)$ is sufficiently close to $(t_0,t_0)$, and that $h(t,t)$, defined by

$$h(t,t) = \lim_{u\downarrow t} h(t,u),$$

is continuous at $t$, for $t$ in a neighborhood of $t_0$.

Let $0 < F_0(t_0), H(t_0,t_0) < 1$, and let $\hat F_n$ be the MLE of $F_0$. Then

$$(n\log n)^{1/3}\left\{\hat F_n(t_0) - F_0(t_0)\right\} \Big/ \left\{\tfrac{3}{4} f_0(t_0)^2/h(t_0,t_0)\right\}^{1/3} \xrightarrow{\mathcal{D}} 2Z,$$

where $Z$ is the last time that standard two-sided Brownian motion minus the parabola $y(t) = t^2$ reaches its
maximum.
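The limit variable $Z$ has no simple closed form, but it is easy to simulate approximately. The sketch below is a crude Monte Carlo approximation under assumptions that are not part of the theorem: the two-sided Brownian motion is discretized on a grid with step `dt`, truncated to $[-T,T]$, and the location of the maximum of $W(t) - t^2$ is taken over grid points only. The function name `simulate_argmax` and all parameter values are illustrative.

```python
import numpy as np

def simulate_argmax(n_paths=2000, T=3.0, dt=0.004, seed=0):
    """Approximate draws of Z = argmax_t {W(t) - t^2}, W two-sided Brownian motion."""
    rng = np.random.default_rng(seed)
    m = int(round(T / dt))
    t = np.linspace(-T, T, 2 * m + 1)       # grid with t[m] = 0
    # independent Brownian paths to the right and to the left of zero, W(0) = 0
    right = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_paths, m)), axis=1)
    left = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_paths, m)), axis=1)
    w = np.concatenate([left[:, ::-1], np.zeros((n_paths, 1)), right], axis=1)
    idx = np.argmax(w - t**2, axis=1)       # grid location of the maximum
    return t[idx]

z = simulate_argmax()
print(z.mean(), z.var())
```

On a grid the maximum is attained at a single point with probability one, so "first" and "last" time of the maximum coincide in the simulation; the truncation to $[-T,T]$ is harmless for moderate $T$ because the parabolic drift dominates quickly.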

It was also shown in [11] that Theorem 3.1 is true for a "toy" estimator, obtained by doing one step of
the iterative convex minorant algorithm, starting the iterations at the underlying distribution function $F_0$;
the "toy" aspect is that we can of course not do this in practice. Although more than
twenty years have passed since this conjecture was launched, it still has not been proved.

For the separated case one can also introduce a toy estimator of the same type and one can again 
formulate the "working hypothesis" that the toy estimator and the MLE have the same pointwise limit 
behavior. Anticipating that this would hold, the asymptotic distribution of the toy estimator is derived in 
[28] for the separated case, under the following conditions. 

(C1) The support of $F_0$ is an interval $[0,M]$, where $M < \infty$.

(C2) $F_0$ and $G$ have densities $f_0$ and $g$ w.r.t. Lebesgue measure on $\mathbb{R}$ and $\mathbb{R}^2$, respectively.

(C3) Let the functions $k_{1,\epsilon}$ and $k_{2,\epsilon}$ be defined by

$$k_{1,\epsilon}(u) = \int_u^M \frac{g(u,v)}{F_0(v)-F_0(u)}\,1_{\{F_0(v)-F_0(u) < \epsilon^{-1}\}}\,dv$$

and

$$k_{2,\epsilon}(v) = \int_0^v \frac{g(u,v)}{F_0(v)-F_0(u)}\,1_{\{F_0(v)-F_0(u) < \epsilon^{-1}\}}\,du.$$

Then, for $i = 1,2$ and each $\epsilon > 0$,

$$\lim_{\alpha\to\infty} \alpha\int_{(t_0,\,t_0+t/\alpha]} k_{i,\epsilon\alpha}(u)\,du = 0.$$

(C4) $0 < F_0(t_0) < 1$ and $0 < H(t_0,t_0) < 1$.



The motivation for these conditions is given in [28] and actually becomes clear from the proof, which is not
given here.

Theorem 3.2 ([28]). Suppose that assumptions (C1) to (C4) hold. Let $k_i$, $i = 1,2$, be defined by

$$k_1(u) = \int_u^M \frac{g(u,v)}{F_0(v)-F_0(u)}\,dv \qquad\text{and}\qquad k_2(v) = \int_0^v \frac{g(u,v)}{F_0(v)-F_0(u)}\,du,$$

and suppose that $f_0$, $g_1$, $g_2$, $k_1$ and $k_2$ are continuous at $t_0$, where $g_1$ and $g_2$ are the first and second marginal
densities of $g$, respectively. Moreover, assume $f_0(t_0) > 0$. Then, if $F_n^{(1)}$ is the estimator of the distribution
function $F_0$, obtained after one step of the iterative convex minorant algorithm, starting the iterations with
$F_0$, we have

$$n^{1/3}\left\{2\xi(t_0)/f_0(t_0)\right\}^{1/3}\left\{F_n^{(1)}(t_0) - F_0(t_0)\right\} \xrightarrow{\mathcal{D}} 2Z,$$

where $Z$ is the last time where standard two-sided Brownian motion minus the parabola $y(t) = t^2$ reaches its
maximum, and where

$$\xi(t_0) = \frac{g_1(t_0)}{F_0(t_0)} + k_1(t_0) + k_2(t_0) + \frac{g_2(t_0)}{1-F_0(t_0)}.$$

It is indeed proved in [12] that, under slightly stronger conditions (the most important one being that
an observation interval always has length $> \epsilon$, for some $\epsilon > 0$), the MLE has the same limit behavior,
using the same norming constants. The expression for the asymptotic variance in the separated case is
remarkably different from the conjectured variance in the non-separated case, which only depends on $F_0$ via
$f_0(t_0)$, showing that only the local behavior, depending on the density at $t_0$, is important for the asymptotic
variance (assuming that the working hypothesis holds).

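Note that if $(T_i,U_i)$ is uniform on the upper triangle of the unit square, with vertices $(0,\epsilon)$, $(0,1)$ and
$(1-\epsilon,1)$, we have:

$$g_1(u) = \frac{2(1-u-\epsilon)}{(1-\epsilon)^2}, \qquad g_2(v) = \frac{2(v-\epsilon)}{(1-\epsilon)^2},$$

and, if $F_0$ is the uniform distribution function on $[0,1]$,

$$k_1(u) = \frac{2\log\{(1-u)/\epsilon\}}{(1-\epsilon)^2}, \qquad k_2(v) = \frac{2\log(v/\epsilon)}{(1-\epsilon)^2},$$

so

$$\xi(t_0) = \frac{2}{(1-\epsilon)^2}\left\{\frac{1-t_0-\epsilon}{t_0} + \log\frac{t_0(1-t_0)}{\epsilon^2} + \frac{t_0-\epsilon}{1-t_0}\right\}$$

in this case.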
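The closed form for this uniform example can be cross-checked against the defining integrals of $k_1$ and $k_2$. The sketch below does this with a midpoint rule; it assumes $\epsilon < t_0 < 1-\epsilon$, so that both marginal densities are positive at $t_0$, and the function names are hypothetical.

```python
import math

def xi_closed_form(t0, eps):
    """Closed-form xi(t0) for uniform F_0 and uniform (T, U) on the separated triangle."""
    c = 2.0 / (1.0 - eps) ** 2
    return c * ((1.0 - t0 - eps) / t0
                + math.log(t0 * (1.0 - t0) / eps ** 2)
                + (t0 - eps) / (1.0 - t0))

def xi_numeric(t0, eps, n=20000):
    """xi(t0) = g1/F0 + k1 + k2 + g2/(1 - F0), with k1 and k2 by midpoint quadrature."""
    c = 2.0 / (1.0 - eps) ** 2               # value of g on its support
    g1 = c * (1.0 - t0 - eps)
    g2 = c * (t0 - eps)
    # k1(t0) = int_{t0+eps}^{1} g / (v - t0) dv, substituting s = v - t0 in (eps, 1 - t0)
    h1 = (1.0 - t0 - eps) / n
    k1 = sum(c / (eps + (j + 0.5) * h1) for j in range(n)) * h1
    # k2(t0) = int_{0}^{t0-eps} g / (t0 - u) du, substituting s = t0 - u in (eps, t0)
    h2 = (t0 - eps) / n
    k2 = sum(c / (eps + (j + 0.5) * h2) for j in range(n)) * h2
    return g1 / t0 + k1 + k2 + g2 / (1.0 - t0)

print(xi_closed_form(0.5, 0.1), xi_numeric(0.5, 0.1))
```

Note how $\xi(t_0)$ grows like $\log(1/\epsilon^2)$ as $\epsilon \downarrow 0$, which is in line with the different (logarithmic) norming in the non-separated case.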

Note that the scaling constants in the (conjectured) Theorem 3.1 and in Theorem 3.2 are of a different
order: in Theorem 3.1 the order is $(n\log n)^{-1/3}$ and in Theorem 3.2 the order is $n^{-1/3}$ (which is also the order
of convergence of the MLE for current status data). One of the reasons to believe that the rate of the MLE
for the non-separated case is indeed of order $(n\log n)^{-1/3}$ is the fact that in [2] a histogram-type estimator is
constructed which locally achieves this rate. Moreover, a simulation study in [18] which compares the MLE
with the histogram estimator in [2] shows that the MLE actually has a smaller variance than the histogram
estimator for the cases analyzed there and has for large samples a variance which is close to the conjectured
asymptotic variance. Nevertheless Theorem 3.1 still has to be proved, no doubt using the associated integral
equations.

We shall now sketch how the integral equations enter into the proof of the local limit result for the
separated case. As in the preceding section, we assume that $F_0$ is defined on $[0,M]$ and has a continuous
derivative $f_0$, staying away from zero on $[0,M]$ (defining $f_0$ at the boundary points by its left and right
limits). The proof starts by showing that

$$\sup_{t\in(0,M)}\left|\int_0^t\left\{\hat F_n(u) - F_0(u)\right\}du\right| = O_p\left(n^{-2/3}\right), \tag{3.1}$$

see Lemma 4.4 on p. 146 of [12]. This is done by studying the integral equation 

using the same notation as in the preceding section, see, e.g., (2.17). This equation has a right-continuous
solution $\phi_{t,F}$ for each $t \in [0,M]$, if we restrict the distribution functions $F$ to the set $\mathcal{F}_\delta$ of discrete
distribution functions on $[0,M]$ with finitely many points of jump, satisfying

$$\sup_{x\in[0,M]}|F(x) - F_0(x)| \le \delta,$$

where we choose $\delta > 0$ sufficiently small. Note that we may assume, with probability tending to one, that
$\hat F_n$ belongs to $\mathcal{F}_\delta$ for sufficiently large $n$.

According to Lemma 4.3 of [12], the set of discontinuities of the solution $\phi_{t,F}$ of the integral equation
(3.2) is contained in the set of discontinuities of $F$, augmented by the point $t$ (which is the only jump of the
function $1_{[0,t)}$ on $[0,M]$). Furthermore, again according to Lemma 4.3 of [12], we get, if $x$ is a point of jump
of $\phi_{t,F}$,

$$\left|\phi_{t,F}(x) - \phi_{t,F}(x-)\right| \le c\{F(x) - F(x-) + 1\},$$

for some $c > 0$ only depending on $F_0$ and $\delta$. For points $x < y$ in an interval not containing jumps of $F$ we
have:

$$\left|\phi_{t,F}(y) - \phi_{t,F}(x)\right| \le c'(y-x),$$

for some constant $c' > 0$, again only depending on $F_0$ and $\delta$.
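Defining, as before (see (2.16)), and writing $s$ for the first observation time to avoid a clash with the index $t$:

$$\theta_{t,\hat F_n}(s,u,\delta_1,\delta_2) = -\delta_1\frac{\phi_{t,\hat F_n}(s)}{\hat F_n(s)} - \delta_2\frac{\phi_{t,\hat F_n}(u)-\phi_{t,\hat F_n}(s)}{\hat F_n(u)-\hat F_n(s)} + (1-\delta_1-\delta_2)\frac{\phi_{t,\hat F_n}(u)}{1-\hat F_n(u)},$$

we get, following the proof of Lemma 4.4 in [12],

$$\int_0^t\left\{\hat F_n(u) - F_0(u)\right\}du = \int \theta_{t,\hat F_n}(s,u,\delta_1,\delta_2)\,dQ_0(s,u,\delta_1,\delta_2).$$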



This reduces again the functional of interest to an integral in the observation space. We would have the
result (3.1) if we could write:

$$\int \theta_{t,\hat F_n}(s,u,\delta_1,\delta_2)\,dQ_0(s,u,\delta_1,\delta_2) = \int \theta_{t,\hat F_n}(s,u,\delta_1,\delta_2)\,d(Q_0 - Q_n)(s,u,\delta_1,\delta_2).$$



This would be true if $\phi_{t,\hat F_n}$ were absolutely continuous w.r.t. $\hat F_n$. But since this is not the case, we
take a function $\bar\phi_{t,\hat F_n}$ close to $\phi_{t,\hat F_n}$ which, apart from possibly having a jump at the point $t$, is absolutely
continuous w.r.t. $\hat F_n$. If $t$ belongs to the interval $(\tau_i,\tau_{i+1})$ between successive jumps of $\hat F_n$, $\bar\phi_{t,\hat F_n}$ is defined

by 



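on the other intervals $\bar\phi_{t,\hat F_n}$ can be defined in the same way as in (2.19). Analogously to what we did before,
we define $\bar\theta_{t,\hat F_n}$ by:

$$\bar\theta_{t,\hat F_n}(s,u,\delta_1,\delta_2) = -\delta_1\frac{\bar\phi_{t,\hat F_n}(s)}{\hat F_n(s)} - \delta_2\frac{\bar\phi_{t,\hat F_n}(u)-\bar\phi_{t,\hat F_n}(s)}{\hat F_n(u)-\hat F_n(s)} + (1-\delta_1-\delta_2)\frac{\bar\phi_{t,\hat F_n}(u)}{1-\hat F_n(u)}.$$

Then:

$$\int\left\{\bar\theta_{t,\hat F_n} - \theta_{t,\hat F_n}\right\}dQ_0 \lesssim \|\hat F_n - F_0\|_2^2 = O_p\left(n^{-2/3}\right)$$

(see p. 147 of [12]).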

Returning to the functionals

$$t \mapsto \psi_n(t) \stackrel{\text{def}}{=} \int_0^t\left\{\hat F_n(u) - F_0(u)\right\}du,$$

we get that, if $x \mapsto \hat F_n(x) - F_0(x)$ is of constant sign on an interval $J_i = [\tau_i,\tau_{i+1})$,

$$\sup_{u\in J_i}|\psi_n(u)| \le \max\left\{|\psi_n(\tau_i)|, |\psi_n(\tau_{i+1})|\right\},$$

since the function $\psi_n$ is then either increasing or decreasing on $J_i$. If, on the other hand, $F_0$ and $\hat F_n$ cross on
the interval $J_i$, $\psi_n$ first increases and then decreases after the crossing point, noting that $\hat F_n$ is constant and
that $F_0$ increases on $J_i$, so we get, if $t \in (\tau_i,\tau_{i+1})$,



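$$
\begin{aligned}
\psi_n(t) &= \int \bar\theta_{t,\hat F_n}\,d(Q_0 - Q_n) + \int \bar\theta_{t,\hat F_n}\,dQ_n + O_p\left(n^{-2/3}\right) \\
&\le \int \bar\theta_{t,\hat F_n}\,d(Q_0 - Q_n) + O_p\left(n^{-2/3}\right),
\end{aligned}
$$

where we use

$$\int \bar\theta_{t,\hat F_n}\,dQ_n \le 0,$$

which is a consequence of the so-called Fenchel duality conditions, characterizing the MLE (see (4.38) on p.
147 of [12]). So we have, apart from a remainder term of order $O_p(n^{-2/3})$, in all cases bounded $\psi_n(t)$ by the
values of $\psi_n$ at the points $\tau_i$ and an integral of the form

$$\int \bar\theta_{t,\hat F_n}\,d(Q_0 - Q_n). \tag{3.3}$$

But $\psi_n(\tau_i)$ can be written

$$\psi_n(\tau_i) = \int \theta_{\tau_i,\hat F_n}\,dQ_0 = \int \bar\theta_{\tau_i,\hat F_n}\,dQ_0 + O_p\left(n^{-2/3}\right) = \int \bar\theta_{\tau_i,\hat F_n}\,d(Q_0 - Q_n) + O_p\left(n^{-2/3}\right),$$

using that for $t = \tau_i$ the function $\bar\phi_{t,\hat F_n}$ is absolutely continuous w.r.t. $\hat F_n$ (the jump of the function $1_{[0,t)}$ is
in that case at the same location as a jump of $\hat F_n$). So we have bounded $\psi_n(t)$ by the empirical integrals of
the form (3.3), and the result now follows by standard empirical process theory.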
Having established (3.1), we now also get for a class $\mathcal{G}$ of right-continuous functions
$g: [0,M] \to \mathbb{R}$ of uniformly bounded variation:

$$\sup_{g\in\mathcal{G}}\left|\int_0^M g(x)\left\{\hat F_n(x) - F_0(x)\right\}dx\right| = O_p\left(n^{-2/3}\right), \tag{3.4}$$

see Corollary 4.3 on p. 149 of [12]. Using (3.4) we can first of all establish:

sup \Fn{x)-Fo{x)\=oJn-^''), 

2:G[0,A/] ^ ' 

see Corollary 3.4 in [12]. 

Next we observe that for an (open, closed or half-open) interval $J_n$, $\hat F_n$ satisfies



I 52 52 i 

teJ„ \ Fn{u) - Fn{t) Foiu) - F„{t) J ' 
{Foju) - Fo{t)}{K{u) - Foju)} 
{Fn{u) ~ Fn{t)}{Fo{u) - Fr,{t)} 



g{t, u) du 



52{Fn[u) - Fo{u)] 
teJ„ {Fn{u) - Fn{t)}{Fo{u) - F„(<)} 
52{F„{u) ~ Foiu)} 



teJ„ {Fniu) - F„{t)}{Fo{u) - F„(t)} 



n — wo)- 

(3.5) 



For the first integral on the right-hand side of (3.5) we get from (3.4): 

{Fniu) - Foiu)} 



teJ„ 



{Foiu) - Foit)}g,it) 



{F,,iu) - Fnit)}{Foiu) - F^it)} 



giu\t) du \ dt = Op | j^j^ ^ 



where $|J_n|$ denotes the length of the interval $J_n$, and for the second integral on the right-hand side of (3.5)
we have, if $|J_n| = O_p(n^{-1/3})$,



L 



52{Fniu) - Foiu)} 



/teJ„ {Fniu) - Fnit)}{Foiu) - Fnit)} 
implying that if J„ is of order Op(n^^/*) both terms are of order Opin^^^^). Since 

I '^2 52 \ ,„ 



lueJ^[Fniu)~Fnit) Fniu)~Foit) 
can be treated in a similar way, we get: 



I ^1 S2 

t€J„ [Fnit) Fniu)~Fnit) 
I ^1 52 



./„lK(0 Foiu) -Fnit) 



J 52 l-5i~52 { 

«eJ„ [Fniu) -Fnit) 1-Fniu) J 

/ 52 1-5^-52 

:jA Fniu) - Foit) 1-Fniu) 



}n + Op , 



if $|J_n| = O_p(n^{-1/3})$. Notice that this replaces the value of $\hat F_n$ in the "off-diagonal" arguments of the integrand
by the corresponding value of $F_0$.

Using this result, one can in fact derive the improved result

$$\sup_{t\in[-c,c]}\left|\hat F_n\left(t_0 + n^{-1/3}t\right) - F_0(t_0)\right| = O_p\left(n^{-1/3}\right),$$

see Lemma 4.6 in [12], for each $c > 0$ and an interior point $t_0 \in (0,M)$. This, in turn, means that if we let
$J_n$ be an interval of order $O(n^{-1/3})$ around a fixed point $t_0 \in (0,M)$, we get:

teJ„i^^nW F„(w)-F„(t)J AeJ„ [F„(ii)-F„(t) 1 - F„{u) ] 

Ji S2 ] f f 62 l-d,~ 52 

t^J^ L Fo{t) F„{u) - F„{t) J ^" 7„g,,„ 1 Foiu) - Fo(t) 1 - Fo(u) 



In particular, if $J_n = [\tau_n, v)$, where $\tau_n$ is a point of jump of $\hat F_n$ such that $|\tau_n - t_0| = O_p(n^{-1/3})$, and where
$v = \tau_n + n^{-1/3}w$, $w > 0$, we get, by the characterization of the MLE:



< 



^1 ^2 1 7^ , /" f '^2 1 - ^1 - (52 



+ n 



^''[jFoit) - F.ito)) + ,,(,y^} .(^,") ^^du 

-r^^/' / {Fn{u)-FQ{to)}[—-r^-—- + - ]—\ g{t,u) dtdu + Op{l), 

where the inequality on the left becomes an equality if the right endpoint $v$ of $J_n$ is also a point of jump of $\hat F_n$.
As a function of $w$ (in $\tau_n + n^{-1/3}w$), the first term on the right-hand side converges to a Brownian motion
process and the second and third term on the right-hand side converge to a parabolic drift added to this
process. The last two terms converge to the greatest convex minorant of this Brownian motion plus parabolic
drift process. So the MLE is indeed asymptotically equivalent to the toy estimator, given in Theorem 3.2,
and its asymptotic distribution is therefore also given by Theorem 3.2. So we have the following result.

Theorem 3.3. (Theorem 4.4 of [12].) Let the conditions

(i) $g_1$ and $g_2$ are continuous, with $g_1(x) + g_2(x) > 0$ for all $x \in [0,M]$.

(ii) $(u,v) \mapsto g(u,v)$ is continuous on its support, with uniformly bounded partial derivatives, except at a
finite number of points, where left and right (partial) derivatives exist.

(iii) $P\{V - U < \varepsilon_0\} = 0$ for some $\varepsilon_0$ with $0 < \varepsilon_0 < M/2$, so g does not have mass close to the diagonal.

be satisfied and let $F_0$ be continuous with a bounded derivative $f_0$ on $[0,M]$, satisfying

$$f_0(x) > c > 0, \qquad x \in (0,M),$$

for some constant $c > 0$. Then we have at each point $t_0 \in (0, M)$:

$$n^{1/3}\{2\xi(t_0)/f_0(t_0)\}^{1/3}\{\hat F_n(t_0) - F_0(t_0)\} \xrightarrow{\mathcal{D}} 2Z,$$

where $\xi$ and Z are defined as in Theorem 3.2. Hence $\hat F_n$ has the same asymptotic distribution as the toy
estimator $F_n^{toy}$ of Theorem 3.2.

4. Local limit theory for the MSLE in the interval censoring model

As mentioned in the introduction, the MLE $\hat F_n$ minimizes, as a function of F, the Kullback-Leibler distance

$$\mathcal{K}(\mathbb{Q}_n, P_{n,F})$$

over probability measures $P_{n,F}$ in the allowed class, where $\mathbb{Q}_n$ is the empirical measure of the observations
$(T_i, U_i, \Delta_{i1}, \Delta_{i2})$, $i = 1, \dots, n$, and $P_{n,F}$ is a measure, defined by

$$\int \psi(t,u,\delta_1,\delta_2)\,dP_{n,F}(t,u,\delta_1,\delta_2) = \int \bigl[\psi(t,u,1,0)F(t) + \psi(t,u,0,1)\{F(u) - F(t)\} + \psi(t,u,0,0)\{1 - F(u)\}\bigr]\,d\mathbb{G}_n(t,u), \tag{4.1}$$

for bounded measurable functions $\psi$ w.r.t. the product of the Borel $\sigma$-algebra on $\mathbb{R}^2$ and the counting
measure on $\{(1,0), (0,1), (0,0)\}$, and where $\mathbb{G}_n$ is the empirical distribution function of the observation pairs
$(T_i, U_i)$.

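On the other hand, the MSLE minimizes the Kullback-Leibler distance

$$\mathcal{K}(\tilde{\mathbb{Q}}_n, \tilde P_{n,F}), \tag{4.2}$$

over F, where $\tilde{\mathbb{Q}}_n$ is a smoothed version of $\mathbb{Q}_n$, defined by

$$\int \psi(t,u,\delta_1,\delta_2)\,d\tilde{\mathbb{Q}}_n(t,u,\delta_1,\delta_2) = \int \psi(t,u,1,0)\,d\tilde{\mathbb{Q}}_n(t,u,1,0) + \int \psi(t,u,0,1)\,d\tilde{\mathbb{Q}}_n(t,u,0,1) + \int \psi(t,u,0,0)\,d\tilde{\mathbb{Q}}_n(t,u,0,0), \tag{4.3}$$

and the three measures on the right-hand side are smoothed versions of the measures $\mathbb{Q}_n(t,u,1,0)$, $\mathbb{Q}_n(t,u,0,1)$
and $\mathbb{Q}_n(t,u,0,0)$, respectively. Furthermore, $\tilde P_{n,F}$ is defined by

$$\int \psi(t,u,\delta_1,\delta_2)\,d\tilde P_{n,F}(t,u,\delta_1,\delta_2) = \int \bigl[\psi(t,u,1,0)F(t) + \psi(t,u,0,1)\{F(u) - F(t)\} + \psi(t,u,0,0)\{1 - F(u)\}\bigr]\,d\tilde G_n(t,u), \tag{4.4}$$

where $d\tilde G_n$ is given by

$$d\tilde G_n(t,u) = d\tilde{\mathbb{Q}}_n(t,u,1,0) + d\tilde{\mathbb{Q}}_n(t,u,0,1) + d\tilde{\mathbb{Q}}_n(t,u,0,0).$$

Minimizing (4.2) is equivalent to maximizing the smoothed log likelihood

$$\ell(F) = \int \log F(t)\,d\tilde{\mathbb{Q}}_n(t,u,1,0) + \int \log\{F(u) - F(t)\}\,d\tilde{\mathbb{Q}}_n(t,u,0,1) + \int \log\{1 - F(u)\}\,d\tilde{\mathbb{Q}}_n(t,u,0,0) \tag{4.5}$$

over F, and the maximizing F, which we will denote by $\tilde F_n$, is called the MSLE.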

We now give a more specific form of (4.5). Let, as before, g be the joint density of the observation pairs
$(T_i, U_i)$, with first marginal $g_1$ and second marginal $g_2$. Moreover, let the densities $h_{01}$, $h_{02}$ and $h_0$ be defined
by

$$h_{01}(t) = F_0(t)g_1(t), \qquad h_{02}(u) = \{1 - F_0(u)\}g_2(u), \qquad h_0(t,u) = \{F_0(u) - F_0(t)\}g(t,u). \tag{4.6}$$

We define $\tilde h_{nj}$, $j = 1,2$, and $\tilde h_n$ as the estimates of the densities $h_{0j}$, $j = 1,2$, and the 2-dimensional density
$h_0$, where

$$\tilde h_{n1}(t) = \frac{1}{n}\sum_{i=1}^n K_{b_n}(t - T_i)\Delta_{i1}, \qquad \tilde h_{n2}(u) = \frac{1}{n}\sum_{i=1}^n K_{b_n}(u - U_i)\Delta_{i3}, \tag{4.7}$$

$$\tilde h_n(t,u) = \frac{1}{n}\sum_{i=1}^n K_{b_n}(t - T_i)K_{b_n}(u - U_i)\Delta_{i2}, \tag{4.8}$$

and

$$K_{b_n}(x) = b_n^{-1}K(x/b_n),$$

for a symmetric continuously differentiable kernel K with compact support, like the triweight kernel

$$K(x) = \tfrac{35}{32}\,(1 - x^2)^3\,1_{[-1,1]}(x), \qquad x \in \mathbb{R}. \tag{4.9}$$

At points near the boundary we use a boundary correction by replacing the kernel K by a linear combination
of $K(u)$ and $uK(u)$. Details on the latter are given in [13]. As in [13], we take $b_n \asymp n^{-1/5}$.
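The kernel estimates in (4.7) and (4.8) can be sketched in a few lines of Python. This is only a minimal illustration: it omits the boundary correction mentioned above, and the function names (`triweight`, `scaled_kernel`, `h_n1`, `h_n`, `bandwidth`) and the argument layout are assumptions made for this sketch, not notation from [13]. The bandwidth $n^{-1/5}$ is the one reported in the captions of Figures 2 and 3.

```python
def triweight(x):
    """Triweight kernel K(x) = (35/32)(1 - x^2)^3 on [-1, 1], zero elsewhere."""
    return 35.0 / 32.0 * (1.0 - x * x) ** 3 if abs(x) <= 1.0 else 0.0


def scaled_kernel(x, b):
    """Rescaled kernel K_b(x) = K(x / b) / b for a bandwidth b > 0."""
    return triweight(x / b) / b


def h_n1(t, obs_t, delta1, b):
    """Kernel estimate of h_01(t) = F_0(t) g_1(t): first formula in (4.7)."""
    n = len(obs_t)
    return sum(scaled_kernel(t - ti, b) * d for ti, d in zip(obs_t, delta1)) / n


def h_n(t, u, obs_t, obs_u, delta2, b):
    """Product-kernel estimate of the 2-dimensional density h_0(t, u), as in (4.8)."""
    n = len(obs_t)
    return sum(
        scaled_kernel(t - ti, b) * scaled_kernel(u - ui, b) * d
        for ti, ui, d in zip(obs_t, obs_u, delta2)
    ) / n


def bandwidth(n):
    """Bandwidth b_n = n^(-1/5) (no boundary correction in this sketch)."""
    return n ** (-0.2)
```

The estimate of $h_{02}$ has the same form as `h_n1`, with the $U_i$ and the indicators $\Delta_{i3}$ in place of the $T_i$ and $\Delta_{i1}$.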
With these definitions (4.5) takes the form:

$$\ell(F) = \int \tilde h_{n1}(t)\log F(t)\,dt + \int \tilde h_{n2}(u)\log\{1 - F(u)\}\,du + \int \tilde h_n(t,u)\log\{F(u) - F(t)\}\,dt\,du, \tag{4.10}$$

and the MSLE is the (sub-)distribution function maximizing (4.10). Figures 2 and 3 show the rather large
improvement of the MSLE over the MLE when the underlying distribution is smooth. Note that, if we would
not use boundary kernels, the measure $\tilde{\mathbb{Q}}_n$ would be defined by

$$\int \psi(t,u,\delta_1,\delta_2)\,d\tilde{\mathbb{Q}}_n(t,u,\delta_1,\delta_2) = \int \psi(t,u,1,0)\Bigl\{\int K_{b_n}(t - x)K_{b_n}(u - y)\,d\mathbb{Q}_n(x,y,1,0)\Bigr\}\,dt\,du$$
$$\qquad + \int \psi(t,u,0,1)\Bigl\{\int K_{b_n}(t - x)K_{b_n}(u - y)\,d\mathbb{Q}_n(x,y,0,1)\Bigr\}\,dt\,du + \int \psi(t,u,0,0)\Bigl\{\int K_{b_n}(t - x)K_{b_n}(u - y)\,d\mathbb{Q}_n(x,y,0,0)\Bigr\}\,dt\,du.$$

The analogous expression for the case that boundary kernels are used is obvious.

It is shown in [13] that, under the separation hypothesis and some additional regularity conditions, the
MSLE is asymptotically equivalent to the solution of a non-linear integral equation. We assume these
conditions (given in Theorem 4.1 of [13]) to be satisfied in the sequel. The relevant integral equation (in F) is
given by:

$$\tilde h_{n1}(t)\{1 - F(t)\} - \tilde h_{n2}(t)F(t) + F(t)\{1 - F(t)\}\Bigl\{\int_{v<t}\frac{\tilde h_n(v,t)}{F(t) - F(v)}\,dv - \int_{u>t}\frac{\tilde h_n(t,u)}{F(u) - F(t)}\,du\Bigr\} = 0, \tag{4.11}$$

see Lemma 4.5 of [13]. Note that the corresponding equation for the underlying model:

$$h_{01}(t)\{1 - F(t)\} - h_{02}(t)F(t) + F(t)\{1 - F(t)\}\Bigl\{\int_{v<t}\frac{h_0(v,t)}{F(t) - F(v)}\,dv - \int_{u>t}\frac{h_0(t,u)}{F(u) - F(t)}\,du\Bigr\} = 0$$

is solved by $F_0$. Using the implicit function theorem in Banach spaces ([4], Theorem 10.2.1), it is shown in
[13] that if $(\tilde h_{n1}, \tilde h_{n2}, \tilde h_n)$ is sufficiently close to $(h_{01}, h_{02}, h_0)$ in the supremum distance, the equation has a
unique solution $\tilde F_n$ in an open ball around $F_0$, again in the supremum distance. Next it is shown that $\tilde F_n$
coincides with the MSLE with probability tending to one and that

$$\|\tilde F_n - F_0\| = O_p\bigl(n^{-3/10}\bigr), \qquad n \to \infty, \tag{4.12}$$

where $\|\cdot\|$ denotes the supremum distance (part (ii) of Lemma 4.5 in [13]). Note that starting with the
non-sharp bound (4.12) is somewhat analogous to the approach in the derivation of the local limit behavior



Fig 2. The MSLE (solid) and MLE (dashed) on [0,1] for a sample of size n = 100 from the distribution function Fo(x) = 
1 — (1 — x)^ (dotted); g is uniform on the triangle with vertices (0,£), (0, 1) and (1 — e, 1), where e = 0.1. The bandwidth for 
the computation of the MSLE was $b_n = n^{-1/5} \approx 0.398107$.






Fig 3. The MSLE (solid) and MLE (dashed) on [0, 1] for a sample of size n = 1000 from the distribution function Fo{x) = 
1 — (1 — x)'^ (dotted); g is uniform on the triangle with vertices (0,£), (0, 1) and (1 — e, I), where e = 0.1. The bandwidth for 
the computation of the MSLE was $b_n = n^{-1/5} \approx 0.251189$.
of the MLE in the preceding section, where first a bound on the supremum distance of order Op(n^^/^) was
derived. In the derivation of (4.12) the implicit function theorem in Banach spaces is again used, but with a
different norm for $(\tilde h_{n1}, \tilde h_{n2}, \tilde h_n)$ (instead of the supremum norm for $\tilde h_n$ a weaker integral-type norm is used).

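Using the bound (4.12) it is subsequently shown that $\tilde F_n$ is close to the solution $\bar F_n$ of the linear integral
equation

$$F(t) - F_0(t) + d_{F_0}(t)\Bigl\{\int_{u<t}\frac{g(u,t)\{F(t) - F_0(t) - F(u) + F_0(u)\}}{F_0(t) - F_0(u)}\,du - \int_{u>t}\frac{g(t,u)\{F(u) - F_0(u) - F(t) + F_0(t)\}}{F_0(u) - F_0(t)}\,du\Bigr\}$$
$$= \frac{\tilde h_{n1}(t)\{1 - F_0(t)\} - \tilde h_{n2}(t)F_0(t)}{\{1 - F_0(t)\}g_1(t) + F_0(t)g_2(t)} + d_{F_0}(t)\Bigl\{\int_{u<t}\frac{\tilde h_n(u,t)}{F_0(t) - F_0(u)}\,du - \int_{u>t}\frac{\tilde h_n(t,u)}{F_0(u) - F_0(t)}\,du\Bigr\}, \tag{4.13}$$

where $d_{F_0}$ is defined by

$$d_{F_0}(t) = \frac{F_0(t)\{1 - F_0(t)\}}{g_1(t)\{1 - F_0(t)\} + g_2(t)F_0(t)}.$$

In fact, it is shown that if $\bar F_n$ is the solution of (4.13), we have:

$$\|\tilde F_n - \bar F_n\| = O_p\bigl(n^{-3/5}\bigr),$$

where $\|\cdot\|$ again denotes the supremum norm, which is a distance of smaller order than we can expect for
$\tilde F_n$ and $\bar F_n$.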

The linear integral equation has properties which are analogous to the properties of the integral equations 
studied in the preceding sections, but is now an equation in F itself instead of an equation in the associated
function $\phi_F$. In fact, an essential difference is that we now have asymptotic equalities and normality instead of
asymptotic inequalities and non-normality. In the case of the MLE we had to infer the asymptotic properties 
via a functional of an associated process (greatest convex minorant of Brownian motion plus a parabolic 
drift), but we do not have to do this in the present case. 

So $\bar F_n$ satisfies

$$\bar F_n(t) - F_0(t) + d_{F_0}(t)\Bigl\{\int_{u<t}\frac{g(u,t)\{\bar F_n(t) - F_0(t) - \bar F_n(u) + F_0(u)\}}{F_0(t) - F_0(u)}\,du - \int_{u>t}\frac{g(t,u)\{\bar F_n(u) - F_0(u) - \bar F_n(t) + F_0(t)\}}{F_0(u) - F_0(t)}\,du\Bigr\}$$
$$= \frac{\tilde h_{n1}(t)\{1 - F_0(t)\} - \tilde h_{n2}(t)F_0(t)}{\{1 - F_0(t)\}g_1(t) + F_0(t)g_2(t)} + d_{F_0}(t)\Bigl\{\int_{u<t}\frac{\tilde h_n(u,t)}{F_0(t) - F_0(u)}\,du - \int_{u>t}\frac{\tilde h_n(t,u)}{F_0(u) - F_0(t)}\,du\Bigr\}.$$

As in the preceding section, the "off-diagonal" terms $\bar F_n(u) - F_0(u)$ in the integrands on the left give a
contribution of lower order $O_p(n^{-1/2})$ ("smooth functionals" again!) and we find that $\bar F_n$ is asymptotically
equivalent to the "toy estimator" $F_n^{toy}$ satisfying

$$\{F_n^{toy}(t) - F_0(t)\}\Bigl\{1 + d_{F_0}(t)\Bigl(\int_{u<t}\frac{g(u,t)}{F_0(t) - F_0(u)}\,du + \int_{u>t}\frac{g(t,u)}{F_0(u) - F_0(t)}\,du\Bigr)\Bigr\}$$
$$= \frac{\tilde h_{n1}(t)\{1 - F_0(t)\} - \tilde h_{n2}(t)F_0(t)}{\{1 - F_0(t)\}g_1(t) + F_0(t)g_2(t)} + d_{F_0}(t)\Bigl\{\int_{u<t}\frac{\tilde h_n(u,t)}{F_0(t) - F_0(u)}\,du - \int_{u>t}\frac{\tilde h_n(t,u)}{F_0(u) - F_0(t)}\,du\Bigr\}. \tag{4.14}$$

Since, by standard theory for kernel estimators, the right-hand side, multiplied by $n^{2/5}$, converges in
distribution to a normal distribution, we get that also $n^{2/5}\{\tilde F_n(t) - F_0(t)\}$ converges to a normal distribution (the
same limit distribution as that of $n^{2/5}\{F_n^{toy}(t) - F_0(t)\}$). The full result, with explicit asymptotic bias and
variance, is given below.

Theorem 4.1. Let condition (1.1) of [13] be satisfied. Moreover, let $F_0$ be twice differentiable, with a bounded
continuous derivative $f_0$ on the interior of $[0,M]$, which is bounded away from zero on $[0,M]$, with a finite
positive right limit at 0 and a positive left limit at M. Also, let $f_0$ have a bounded continuous derivative on
$(0, M)$ and let $g_1$ and $g_2$ be twice differentiable on the interior of their supports $S_1$ and $S_2$, respectively, and
let $g_1\{1 - F_0\} + g_2F_0$ stay away from zero on $[0,M]$, where $g_1$ and $g_2$ are the marginals of the joint density
g of the pair of observation times $(T_i, U_i)$. Furthermore, let g have a bounded (total) second derivative on the
interior of its support S, having finite limits approaching the boundary of S. Suppose that $X_i$ is independent
of $(T_i, U_i)$, and let $d_{F_0}$ be defined by

$$d_{F_0}(v) = \frac{F_0(v)\{1 - F_0(v)\}}{g_1(v)\{1 - F_0(v)\} + F_0(v)g_2(v)}.$$

Then, if $b_n \asymp n^{-1/5}$, we have for each $v \in (0, M)$,

v/n^: |f„(^;) - Foiv) - n (O, aiv)') , 

where 

gi[v){l- Fo(v)\ + Fo[v}g2(v) J 

+ dFo{v)< -rrh. ^n^dt- -f^ -—-du) uK{u)du, 4.15) 

where $h_0$, $h_{01}$ and $h_{02}$ are defined by (4.6) and

aiiv)^l+dp„iv)\f ^,f%,. dt^[ ^j^^^^^duX, (4.16) 
VJt<v Fo{v) - F„{t) Fo{w) - Fo{v) J 

and where $N(0,\sigma(v)^2)$ is a normal distribution with first moment zero and variance $\sigma(v)^2$, defined by

2 dF„{v) f r^,^2 



a{vY = / K{uYdu. (4.17) 



5. Deconvolution, smooth functionals 



The theory of MLEs for deconvolution is full of peculiar facts and unsolved problems. We can again expect 
that the use of MLEs will produce efficient estimates of smooth functionals. Whether this will give better 
estimates than naive estimators will depend on the model and in particular on the properties of the tangent 
spaces, associated with the model. We start with a simple example, where the estimate of the first moment, 
using the MLE, coincides with a moment estimate. 

Suppose our observations $Z_1, \dots, Z_n$ are a sample of the form

$$Z_i = X_i + Y_i,$$

where the $X_i$ and $Y_i$ are independent, and $Y_i$ has a (known) normal $N(\mu, 1)$ distribution. A natural estimate
of the first moment of the distribution of the $X_i$ is the estimate

$$T_n = n^{-1}\sum_{i=1}^n Z_i - \mu. \tag{5.1}$$

The MLE of the unknown distribution function of the $X_i$ is the distribution function $\hat F_n$, maximizing

$$\ell(F) = \int \log \int \varphi(z - x - \mu)\,dF(x)\,d\mathbb{H}_n(z),$$

over F, where $\mathbb{H}_n$ is the empirical distribution function of the $Z_i$ and $\varphi$ is the standard normal density. So
another estimate of the first moment of the distribution of the $X_i$ is the estimate

$$T_n' = \int x\,d\hat F_n(x).$$

But a simple calculation, which is omitted here, shows that, in fact, $T_n' = T_n$, so the two methods produce
exactly the same (efficient) estimate here.

This relation does not hold for higher moments, however. We could, for example, estimate the variance of
the $X_i$ by

$$U_n = n^{-1}\sum_{i=1}^n (Z_i - \bar Z_n)^2 - 1,$$

where $\bar Z_n$ is the mean of the $Z_i$, but also by

$$U_n' = \int x^2\,d\hat F_n(x) - \Bigl\{\int x\,d\hat F_n(x)\Bigr\}^2.$$

Here we do not get $U_n = U_n'$; for example, $U_n$ can have negative values, in contrast with $U_n'$. On theoretical
grounds, one would expect the MLE to produce an asymptotically efficient estimate of the variance, but
looking at simulations, one also would expect this efficiency only to show up for huge sample sizes, because
of the highly discrete character of the MLE, which only has very few points of mass for moderate sample
sizes.
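The remark that the moment-type variance estimate $U_n$ can be negative is easy to check by simulation. The sketch below computes $T_n$ of (5.1) and $U_n$ for normal deconvolution; the choice $X_i \sim$ Uniform(0, 0.5) with $\mu = 0$ and the function names are assumptions made purely for illustration. Computing the MLE-based estimates $T_n'$ and $U_n'$ is not attempted here.

```python
import random


def moment_estimates(z, mu):
    """Return (T_n, U_n) for observations z_i = x_i + y_i with y_i ~ N(mu, 1)."""
    n = len(z)
    z_bar = sum(z) / n
    t_n = z_bar - mu                                    # first-moment estimate (5.1)
    u_n = sum((zi - z_bar) ** 2 for zi in z) / n - 1.0  # variance estimate U_n
    return t_n, u_n


def fraction_negative(n, reps, seed=0):
    """Fraction of simulated samples with U_n < 0 (illustrative model only)."""
    rng = random.Random(seed)
    count = 0
    for _ in range(reps):
        # Hypothetical model: X ~ Uniform(0, 0.5), Y ~ N(0, 1).
        z = [rng.uniform(0.0, 0.5) + rng.gauss(0.0, 1.0) for _ in range(n)]
        if moment_estimates(z, 0.0)[1] < 0.0:
            count += 1
    return count / reps
```

With a small variance of the $X_i$ relative to the unit noise variance, as in this illustrative model, a substantial fraction of samples of size 50 gives a negative $U_n$.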

The usual method of producing estimates of F is to first estimate the characteristic function of the data in 
some way, and then use the fact that the characteristic function of the convolution is a product, meaning that 
one can divide by the characteristic function of the distribution of the known component of the deconvolution 
to obtain the characteristic function of the unknown component. This does not necessarily produce efficient 
estimates of the smooth functionals, however, while for the MLE there is a general theory, predicting the 
efficiency of the estimates of smooth functionals based on the MLE, as also shown in the preceding sections.

Dividing by the characteristic function of the known component becomes more difficult if this characteristic 
function has zeroes, as in the case of the uniform distribution. In this case the moment estimator (5.1) also 
does not produce an efficient estimate of the first moment. Deconvolution for the case that $Y_i$ has a uniform
distribution is sometimes called "box-car" deconvolution. We consider here the simplest case, where the 
distributions of the Xi and Yi both have support [0, 1] and the Xi have an absolutely continuous distribution 
function Fq. As discussed in [20] (Exercise 2, section 2.3, p. 61), the model is a special case of the current 
status model in this case. This is seen in the following way. 

Let $\Delta_i = 1_{\{Z_i < 1\}}$ and let $Z_i'$ be defined by

$$Z_i' = \begin{cases} Z_i, & \text{if } \Delta_i = 1,\\ Z_i - 1, & \text{if } \Delta_i = 0. \end{cases} \tag{5.2}$$

Then $Z_1', \dots, Z_n'$ is distributed as a sample from a Uniform(0, 1) distribution. Moreover, the log likelihood
for the unknown distribution function F can be written

$$\ell(F) = \sum_{i=1}^n \bigl\{\Delta_i \log F(Z_i') + (1 - \Delta_i)\log\{1 - F(Z_i')\}\bigr\},$$

and we have, for $t, t + h \in (0, 1)$,

$$P\{\Delta_i = 1,\ Z_i' \in [t, t + h]\} = P\{Z_i \in [t, t + h]\} \sim h\int_0^t dF_0(u) = hF_0(t), \qquad h \downarrow 0,$$

and

$$P\{\Delta_i = 0,\ Z_i' \in [t, t + h]\} = P\{Z_i \in [1 + t, 1 + t + h]\} \sim h\int_t^1 dF_0(u) = h\{1 - F_0(t)\}, \qquad h \downarrow 0,$$

so we get factorization of the current status model for $X_i$ and the corresponding observation $Z_i'$. This means,
by Theorem 5.5 in [20], that

$$\sqrt{n}\Bigl\{\int x\,d\hat F_n(x) - \int x\,dF_0(x)\Bigr\} \xrightarrow{\mathcal{D}} N\bigl(0, \sigma_{F_0}^2\bigr),$$

where

$$\sigma_{F_0}^2 = \int_0^1 F_0(t)\{1 - F_0(t)\}\,dt.$$

This also follows from Example 11.2.3b, p. 226, in [26], where it is at the same time shown that this is the
efficient asymptotic variance.
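The reduction of box-car deconvolution to the current status model can be illustrated with a short simulation. The sketch below applies the map (5.2) and simulates $Z_i = X_i + Y_i$ with $Y_i$ uniform on [0, 1]; the choice $F_0(x) = x^2$ and the function names are assumptions for illustration only. Under this choice the transformed observation times should look uniform, and the overall fraction of indicators equal to 1 should be close to $\int_0^1 F_0(t)\,dt = 1/3$.

```python
import random


def to_current_status(z):
    """Map an observation z in [0, 2] to the pair (z', delta) of (5.2)."""
    if z < 1.0:
        return z, 1
    return z - 1.0, 0


def simulate(n, seed=0):
    """Simulate transformed pairs for F_0(x) = x^2 (hypothetical) and Y ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.random() ** 0.5   # inverse-cdf sampling from F_0(x) = x^2
        y = rng.random()
        pairs.append(to_current_status(x + y))
    return pairs
```

The resulting pairs have the same structure as current status data: an observation time $Z_i'$ and an indicator $\Delta_i$ whose conditional success probability at time t is $F_0(t)$.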

On the other hand, if we would take the moment estimate $T_n$ of (5.1) to estimate the first moment of the
distribution of the $X_i$, we would get

$$\sqrt{n}\Bigl\{T_n - \int x\,dF_0(x)\Bigr\} \xrightarrow{\mathcal{D}} N(0, \sigma^2),$$

where

$$\sigma^2 = \tfrac{1}{12} + \operatorname{var}(X_1).$$

Since

$$\operatorname{var}(X_1) = 2\int_0^1 x\{1 - F_0(x)\}\,dx - \Bigl[\int_0^1 \{1 - F_0(x)\}\,dx\Bigr]^2,$$

a simple variational argument shows that $\sigma_{F_0}^2 < \sigma^2$, unless $F_0$ is the uniform distribution function, in which
case $\sigma_{F_0}^2 = \sigma^2$. So in this case, the estimate of the first moment, based on the MLE, is more efficient than
the moment estimate $T_n$, in contrast with what happened for normal deconvolution.
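The comparison of the two asymptotic variances can be checked numerically for specific choices of $F_0$. The following sketch evaluates both expressions with a midpoint rule; the helper names and the test case $F_0(x) = x^2$ are assumptions for illustration. For the uniform distribution both variances equal 1/6, while for $F_0(x) = x^2$ one gets 2/15 for the MLE-based estimate against 5/36 for the moment estimate.

```python
def integrate(f, a, b, m=20000):
    """Midpoint rule on [a, b] with m subintervals."""
    step = (b - a) / m
    return step * sum(f(a + (k + 0.5) * step) for k in range(m))


def variance_mle(F0):
    """Asymptotic variance int_0^1 F_0(t){1 - F_0(t)} dt of the MLE-based estimate."""
    return integrate(lambda t: F0(t) * (1.0 - F0(t)), 0.0, 1.0)


def variance_moment(F0):
    """Asymptotic variance 1/12 + var(X_1) of the moment estimate T_n."""
    mean = integrate(lambda x: 1.0 - F0(x), 0.0, 1.0)
    second = 2.0 * integrate(lambda x: x * (1.0 - F0(x)), 0.0, 1.0)
    return 1.0 / 12.0 + second - mean ** 2
```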

We now generalize this example to the situation that the convolution kernel is a continuously differentiable
decreasing density g on [0, 1] and $F_0$ is again an absolutely continuous distribution function, concentrated
on [0, 1]. As in section 2, we consider functionals $K(F)$ satisfying condition (D1), of which the first moment
of F is the prototype, so

$$K(F) = K(F_0) + \int \kappa_{F_0}(x)\,d(F - F_0)(x) + O\bigl(\|F - F_0\|_2^2\bigr),$$

and we try again to prove:

||i^„-i^o||'-Op(n-i/2), (5.3) 
and next to prove, using the characterizing properties of the MLE, 

f i^p^Xx)d{K-Fo){x)=n'/^ f 0fJz) d {M^ ~ Ho) (z) + Oj, (n-'^^y (5.4) 



$H_0$ is the distribution function of the convolution and $\mathbb{H}_n$ the empirical distribution function of the
observations $Z_i = X_i + Y_i$, and where $\theta_{F_0}$ is given by:

9p,iz)^E{Kp„{X,)\X,+Y,=z}. (5.5) 

Note the analogy with (2.3) and (2.4) in section 2. Introducing an intermediate function $\phi_{F_0}$ again, just as
in section 2, we get the representation

$$\theta_{F_0}(z) = \frac{\int_{x\in[0,z]} g(z - x)\,d\phi_{F_0}(x)}{h_0(z)}, \tag{5.6}$$

where $\phi_{F_0}$ is absolutely continuous w.r.t. $F_0$, g is the (decreasing) density of the $Y_i$ on [0, 1], and $h_0$ is the
density of the observations $Z_i$. This leads to the integral equation (in $\phi_{F_0}$):

$$\kappa_{F_0}(x) = E\{\theta_{F_0}(Z_i)\mid X_i = x\} = \int_{z>x}\frac{\int g(z - u)\,d\phi_{F_0}(u)}{h_0(z)}\,g(z - x)\,dz, \tag{5.7}$$

which is the adjoint equation to the equation (5.5) (more on the relation between the equation (5.5) and its
adjoint (5.7) in the deconvolution problem can be found in part 1 of [20]). Note that this is an equation in
the function $\phi_{F_0}$ (or the measure $d\phi_{F_0}$) rather than the function $\theta_{F_0}$. Also note that we can write

$$\kappa_{F_0}(x) = \int_{z>x}\theta_{F_0}(z)\,g(z - x)\,dz, \tag{5.8}$$

giving a seemingly simpler equation in $\theta_{F_0}$. The essential (sometimes ignored) fact, however, is that $\theta_{F_0}$ has
to have a representation of the form (5.6) (or, more generally, that it has to be a limit of representations of
this form: it has to be in the closure of the range of the score operator). Without this restriction on $\theta_{F_0}$ the
equation (5.8) would have infinitely many solutions; the restriction that $\theta_{F_0}$ has to have a representation of
the form (5.6) or has to be a limit of such representations makes the solution unique, however.
Defining $\phi_{F_0}(0) = 0$ and using integration by parts we get:

$$\frac{\int_{u=0}^{z} g(z - u)\,d\phi_{F_0}(u)}{h_0(z)} = \frac{g(0)\phi_{F_0}(z) + \int_{u=0}^{z}\phi_{F_0}(u)\,g'(z - u)\,du}{h_0(z)},$$

and by differentiating (5.7) we obtain the integral equation

$$\phi_{F_0}(x) + \int_0^1 A(x,u)\,\phi_{F_0}(u)\,du = -\frac{\kappa_{F_0}'(x)\,h_0(x)}{g(0)^2}, \qquad x \in (0, 1), \tag{5.9}$$

where the kernel A of the integral equation is given by:

$$A(x,u) = \frac{g'(x - u)}{g(0)} + \frac{h_0(x)\,g'(u - x)}{g(0)\,h_0(u)} + \frac{h_0(x)}{g(0)^2}\int_{z=x\vee u}^{1+x\wedge u}\frac{g'(z - u)\,g'(z - x)}{h_0(z)}\,dz. \tag{5.10}$$

So, just as in section 2, we obtain a Fredholm integral equation of the second kind.
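A Fredholm integral equation of the second kind, $\phi(x) + \int_0^1 A(x,u)\phi(u)\,du = r(x)$, can be discretized by replacing the integral with a quadrature sum and solving the resulting linear system (a Nyström-type scheme). The sketch below is a generic illustration of that idea with a midpoint rule and a small Gaussian elimination routine. It is not the numerical method used for the figures in this paper, which is not specified here, and the bounded test kernel $A(x,u) = xu$ with right-hand side $r(x) = x$, for which the exact solution is $\phi(x) = 3x/4$, is chosen only because it can be checked by hand.

```python
def solve_linear(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def nystrom(kernel, rhs, m=100):
    """Approximate the solution of phi(x) + int_0^1 kernel(x, u) phi(u) du = rhs(x)
    at the m midpoints of an equidistant grid on [0, 1]."""
    h = 1.0 / m
    nodes = [(j + 0.5) * h for j in range(m)]
    a = [
        [(1.0 if i == j else 0.0) + h * kernel(x, u) for j, u in enumerate(nodes)]
        for i, x in enumerate(nodes)
    ]
    b = [rhs(x) for x in nodes]
    return nodes, solve_linear(a, b)
```

For kernels that are unbounded near a corner of the unit square, as happens in the example discussed next, a plain midpoint rule of this kind would need a finer or graded grid near the singularity.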

To give a concrete example of what these functions $\theta_{F_0}$ and $\phi_{F_0}$ look like, we take g equal to the "elbow
density" $g(x) = 2(1 - x)1_{[0,1]}(x)$ and let $F_0$ be the uniform distribution function on [0, 1]. Furthermore, we
take

$$\kappa_{F_0}(x) = x - \int u\,dF_0(u), \tag{5.11}$$

the gradient corresponding to the first moment of the distribution, given by $F_0$. For this model the kernel
$A(x, u)$ becomes:

$$A(x,u) = -1_{[u,1)}(x) - \frac{h_0(x)\,1_{(x,1)}(u)}{h_0(u)} + h_0(x)\int_{z=x\vee u}^{1+x\wedge u}\frac{dz}{h_0(z)}.$$

Moreover:

$$h_0(z) = \begin{cases} z(2 - z), & z \in [0,1],\\ (2 - z)^2, & z \in (1,2]. \end{cases} \tag{5.12}$$

So we get:

$$A(x,u) = -1_{(0,x)}(u) - \frac{x(2 - x)\,1_{[x,1)}(u)}{u(2 - u)} + x(2 - x)\Bigl\{\frac12\log\Bigl(\frac{2 - x\vee u}{x\vee u}\Bigr) + \frac{x\wedge u}{1 - x\wedge u}\Bigr\}, \tag{5.13}$$

for $x, u \in (0, 1)$. This immediately points to one reason why solving this type of integral equation is more
difficult than solving the integral equation we studied in section 2: the kernel of the integral equation is
unbounded.
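The closed forms (5.12) and (5.13) for the elbow density with uniform $F_0$ can be checked numerically. The sketch below compares (5.12) with a direct numerical convolution, compares the closed-form integral of $1/h_0$ appearing in (5.13) with a midpoint-rule value, and evaluates the kernel on the diagonal to show its growth as $x = u$ approaches 1. The function names are assumptions for this illustration.

```python
import math


def g(x):
    """Elbow density g(x) = 2(1 - x) on [0, 1]."""
    return 2.0 * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0


def h0_closed(z):
    """Closed form (5.12) of the convolution density for uniform F_0."""
    if 0.0 <= z <= 1.0:
        return z * (2.0 - z)
    if 1.0 < z <= 2.0:
        return (2.0 - z) ** 2
    return 0.0


def midpoint(f, a, b, m=20000):
    """Midpoint rule on [a, b] with m subintervals."""
    step = (b - a) / m
    return step * sum(f(a + (k + 0.5) * step) for k in range(m))


def h0_numeric(z):
    """h_0(z) = int_0^1 g(z - x) dx, computed numerically."""
    return midpoint(lambda x: g(z - x), 0.0, 1.0)


def kernel_A(x, u):
    """Kernel (5.13) for x, u in (0, 1)."""
    lo, hi = min(x, u), max(x, u)
    first = -1.0 if u < x else 0.0
    second = -x * (2.0 - x) / (u * (2.0 - u)) if u >= x else 0.0
    third = x * (2.0 - x) * (0.5 * math.log((2.0 - hi) / hi) + lo / (1.0 - lo))
    return first + second + third
```

Evaluating `kernel_A(x, x)` for x = 0.5, 0.9 and 0.999 shows the rapid growth caused by the term $(x\wedge u)/(1 - x\wedge u)$.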
Furthermore, taking the gradient given by (5.11) and using (5.12) again, the integral equation (5.9) turns
into

$$\phi_{F_0}(x) + \int_0^1 A(x,u)\,\phi_{F_0}(u)\,du = -\tfrac14\,x(2 - x), \qquad x \in (0,1). \tag{5.14}$$

This equation was solved numerically and a picture of $\phi_{F_0}$ is given in Figure 4. The corresponding $\theta_{F_0}$,
defined by (5.6), is shown in Figure 5. The function $z \mapsto \theta_{F_0}(z)$ is unbounded near z = 2 and has a cusp at
z = 1. It can be shown, though, that

$$\int \theta_{F_0}(z)^2\,h_0(z)\,dz < \infty, \tag{5.15}$$

and $\theta_{F_0} \in L_2^0(H_0)$, where $L_2^0(H_0)$ is the space of square integrable functions w.r.t. $dH_0$ which integrate to
zero w.r.t. $dH_0$. In fact the left-hand side of (5.15) can be expected to be the (efficient) asymptotic variance
of

$$n^{1/2}\Bigl\{\int x\,d\hat F_n(x) - \int x\,dF_0(x)\Bigr\}.$$

A picture of $\theta_{F_0}\sqrt{h_0}$ is shown in Figure 6. Numerical evaluation of (5.15) gave a value of approximately
0.137, and, for example, a simulation of 1000 samples of size n = 1000, where the MLE was computed using
the support reduction algorithm of [15], gave the value 0.139 for n times the variance of the sample estimates
(we have the impression that the sample variance times n converges to the asymptotic value from above, as
$n \to \infty$; sample size n = 100 gave slightly larger values), so simulations seem to give a nice agreement with
the conjectured asymptotic variance.




Fig 4. The function $\phi_{F_0}$, solving the integral equation (5.14).

In order to prove

$$n^{1/2}\Bigl\{\int x\,d\hat F_n(x) - \int x\,dF_0(x)\Bigr\} \xrightarrow{\mathcal{D}} N\bigl(0, \sigma_{F_0}^2\bigr),$$

where

$$\sigma_{F_0}^2 = \int \theta_{F_0}(z)^2\,h_0(z)\,dz,$$

we will need the following lemma, which is analogous to Lemma 2.1.

Fig 5. The function $\theta_{F_0}$, defined by (5.6), for the same model as used in Figure 4.

Fig 6. The function $\theta_{F_0}\sqrt{h_0}$, for the same model as used in Figure 4.



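Lemma 5.1. Let the density $h_n$ be defined by

$$h_n(z) = \int g(z - x)\,d\hat F_n(x), \qquad z \in [0, 2],$$

and let $\tau_1$ and $\tau_m$ be the smallest and largest point of jump of $\hat F_n$, respectively. Furthermore, let, in analogy
with (5.6), $\theta_{\hat F_n}$ be defined by

$$\theta_{\hat F_n}(z) = \frac{\int g(z - u)\,d\phi_{\hat F_n}(u)}{h_n(z)}, \qquad z \in [\tau_1, 1 + \tau_m), \tag{5.16}$$

where the function $\phi_{\hat F_n}$ solves the integral equation (in $\phi$)

$$\phi(x) + \int A_n(x,u)\,\phi(u)\,du = -\frac{h_n(x)}{g(0)^2}, \qquad x \in [\tau_1, \tau_m), \tag{5.17}$$

and where the kernel $A_n$ is given by

$$A_n(x,u) = \frac{g'(x - u)}{g(0)} + \frac{h_n(x)\,g'(u - x)}{g(0)\,h_n(u)} + \frac{h_n(x)}{g(0)^2}\int_{z=x\vee u}^{1+x\wedge u}\frac{g'(z - u)\,g'(z - x)}{h_n(z)}\,dz.$$

We define $\phi_{\hat F_n}(x) = 0$, if $x \in [0, \tau_1)$ or $x \ge \tau_m$. Then $\theta_{\hat F_n}$ satisfies

$$\int_{z>x}\theta_{\hat F_n}(z)\,g(z - x)\,dz = x - \int u\,d\hat F_n(u), \qquad x \in [\tau_1, \tau_m).$$

The function $\theta_{\hat F_n}$ can uniquely and continuously be extended to $(0, 2 \vee (1 + \tau_m))$ such that

$$\int_{z>x}\theta_{\hat F_n}(z)\,g(z - x)\,dz = x - \int u\,d\hat F_n(u), \qquad x \in (0, 1). \tag{5.18}$$

Moreover, for $\theta_{\hat F_n}$, extended in this way, we have:

$$\int x\,d(\hat F_n - F_0)(x) = -\int \theta_{\hat F_n}(z)\,dH_0(z). \tag{5.19}$$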

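Remark 5.1. Note that

$$h_n(z) = 0, \qquad z \notin [\tau_1, 1 + \tau_m).$$

Also note that we can have $\tau_m < 1$ and $\tau_m > 1$; the latter event is more typical.
To show that (5.19) holds, note that

$$\int \theta_{\hat F_n}(z)\,dH_0(z) = \int \theta_{\hat F_n}(z)\int g(z - x)\,dF_0(x)\,dz = \int\Bigl\{\int \theta_{\hat F_n}(z)\,g(z - x)\,dz\Bigr\}\,dF_0(x)$$
$$= \int\Bigl\{x - \int u\,d\hat F_n(u)\Bigr\}\,dF_0(x) = -\int x\,d(\hat F_n - F_0)(x).$$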

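Just as in section 2, we would like to have a relation of the form

$$\int x\,d(\hat F_n - F_0)(x) = \int \bar\theta_{\hat F_n}(z)\,d(\mathbb{H}_n - H_0)(z) + o_p\bigl(n^{-1/2}\bigr)$$

instead of (5.19). Proceeding as in section 2, we construct a function $\bar\phi_{\hat F_n}$ which is absolutely continuous
w.r.t. $\hat F_n$ and which is close to $\phi_{\hat F_n}$. Next we define

$$\bar\theta_{\hat F_n}(z) = \begin{cases} \dfrac{\int_{x\in[0,z]} g(z - x)\,d\bar\phi_{\hat F_n}(x)}{h_n(z)}, & \text{if } h_n(z) > 0,\\[2mm] 0, & \text{otherwise.} \end{cases} \tag{5.20}$$

To this end, we define $\bar\phi_{\hat F_n}$ as the solution of the equation in $\phi$:

$$\int\Bigl\{\int \frac{g(z - u)\,g(z - x)}{h_n(z)}\,dz\Bigr\}\,d\phi(u) = x - \int y\,d\hat F_n(y), \qquad \text{a.e.} \tag{5.21}$$

Note that this is a finite matrix equation which only has to be solved at the points of jump of $\hat F_n$. Then $\bar\phi_{\hat F_n}$
is absolutely continuous w.r.t. $\hat F_n$, since it is a right-continuous piecewise constant function, having (finitely
many) jumps at the same locations as $\hat F_n$. It follows from Proposition 2.1, p. 54 in [20] (see also [5] and [14])
that

$$\int_{z>\tau_i}\frac{g(z - \tau_i)}{h_n(z)}\,d\mathbb{H}_n(z) = 1, \qquad i = 1, \dots, m.$$

So we get:

$$\int \bar\theta_{\hat F_n}(z)\,d\mathbb{H}_n(z) = \int\Bigl\{\int \frac{g(z - x)}{h_n(z)}\,d\mathbb{H}_n(z)\Bigr\}\,d\bar\phi_{\hat F_n}(x) = \int d\bar\phi_{\hat F_n}(x) = \bar\phi_{\hat F_n}(\tau_m) - \bar\phi_{\hat F_n}(0) = 0. \tag{5.22}$$

A picture of the functions $\phi_{\hat F_n}$ and $\bar\phi_{\hat F_n}$ is shown in Figure 7.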



Fig 7. The function $\phi_{\hat F_n}$ (solid), defined as in Lemma 5.1, and the function $\bar\phi_{\hat F_n}$ (dashed), defined as the solution of the
equation (5.21), for the same model as used in Figure 4, and for a sample of size n = 1000; $\tau_m = 1.08617$.

Furthermore, by Theorem 5, p. 522, of [6]: 

ll-F„--Fo||2 = Op (""'^^) ■ 
This suggests, using methods analogous to the methods used in [8], 

J {9pjz) - 9pjz)} dHoiz) ^ Op (n-'/^^ . (5.23) 
In fact, we have:

{ep^{z) - 9pJz)}K{z)dz = ^ ^ - 0Fjz)}hn{z)dz 

= \ g{z - x)dz \ d ((j)p - \ (x) = d ~ (j}p ) {x) = 0, 

and hence 

j {ep^^{z)-ep^{z)]ho{z)dz^ J {epjz)-9pjz)}{hoiz)-hniz)}dz. 

We write the integrand on the right-hand side as the product of the functions 

z^ {Opjz) - Opjz)} I K{z) + ^hQ{z) 



and 



z ^ 



\J'hn{z) - \/ha{z) 



It is proved in [6] that 



^ 2 ^ 1/2 

^K{z)-./h^)\ dz\ =0p(n-i/3). 



If it can also be shown that 



2 ^ 1/2 

[K{z) + ho{z)] dz\ -Op(n-i/6), 
we would obtain, using (5.19) to (5.23) and the Cauchy-Schwarz inequality, 

xd{K-Fo)ix)^- J Opjz) dHo{z) = -J ep^{z)dHo{z)+Op(n-'^^^ 

= j (^) d{^n - Ho) (z) + Op {n-'^^) . (5.24) 



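A picture of the functions $\theta_{\hat F_n}\{\sqrt{h_n} + \sqrt{h_0}\}/2$ and $\bar\theta_{\hat F_n}\{\sqrt{h_n} + \sqrt{h_0}\}/2$ is shown in Figure 8, showing that
these functions are really close on the interval $[0, 1 + 1 \vee \tau_m]$. Figure 9 compares $\theta_{\hat F_n}\{\sqrt{h_n} + \sqrt{h_0}\}/2$ and
$\theta_{F_0}\sqrt{h_0}$.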
The only remaining step would be to show that 

^. f 9pJz)d{M„-Ho)iz) = V^ f 0F,Az)d(M.a-Ho){z) + Opil), 



and this representation would again give asymptotic efficiency of the estimate of the first moment, based on
the MLE. It is clear that the heart of the difficulty of the proof is the unboundedness of the functions $\theta_{F_0}$,
$\theta_{\hat F_n}$ and $\bar\theta_{\hat F_n}$ at the right endpoint of the interval on which they are defined, which is caused by the fact that
the decreasing convolution density g approaches zero at the right endpoint of the interval [0, 1].

6. Deconvolution, local limits 

We briefly discuss the conjectured local limit behavior of the MLE. In [11] the following conjecture was
formulated.

Fig 8. The function $\theta_{\hat F_n}\{\sqrt{h_n} + \sqrt{h_0}\}/2$ (solid), defined by (5.16), and the function $\bar\theta_{\hat F_n}\{\sqrt{h_n} + \sqrt{h_0}\}/2$ (dashed), defined by
(5.20), for the same model as used in Figure 4, and for the same sample as used in Figure 4; $\tau_m = 1.08617$.

Fig 9. The function $\theta_{\hat F_n}\{\sqrt{h_n} + \sqrt{h_0}\}/2$ (solid), defined by (5.16), and the function $\theta_{F_0}\sqrt{h_0}$ (dashed), defined by (5.6), on
the interval [0, 2], for the same model as used in Figure 4, and for the same sample as used in Figure 4.

Theorem 6.1. (Conjectured theorem in 5.4 of [11], also theorem in 5.4 of [20].) Let g be a right-continuous
decreasing density on $[0,\infty)$, having only a finite number of discontinuity points $a_0 = 0 < a_1 < \dots < a_m$.
Moreover, suppose that g has a derivative $g'(x)$ at points $x \ne a_i$, $i = 0, \dots, m$, satisfying

$$\int_{(0,\infty)}\frac{g'(x)^2}{g(x)}\,dx < \infty,$$

where the integrand is defined to be zero at the points $a_i$ and at points x where g is zero, and where $g'$ is
bounded and continuous on the intervals $(a_{i-1}, a_i)$, $i = 1, \dots, m + 1$, with $a_{m+1} = \infty$.

Furthermore, assume that there exist positive constants $k_1$ and $k_2$ such that the derivative $g'$ of g satisfies
the relation

$$|g'(t + u)| \le k_1|g'(t)|,$$

for all $t > 0$ and $0 < u < k_2$, such that $a_i < t < t + u < a_{i+1}$ for some i, $0 \le i \le m$.
Let the convolution density h be given by

$$h(z) = \int g(z - x)\,dF_0(x), \qquad z \ge 0,$$

where the distribution function $F_0$ of the (non-negative) random variables $X_i$, $1 \le i \le n$, is continuously
differentiable at $z_0 > 0$, with derivative $f_0(z_0) > 0$ at $z_0$. Then

$$n^{1/3}\{\hat F_n(z_0) - F_0(z_0)\}\,f_0(z_0)^{-1/3}\Bigl\{2\sum_{i=0}^{m}\bigl(g(a_i) - g(a_i-)\bigr)^2/h(z_0 + a_i)\Bigr\}^{1/3} \xrightarrow{\mathcal{D}} 2Z,$$

where $\xrightarrow{\mathcal{D}}$ denotes convergence in distribution, and where Z is the last time where standard two-sided
Brownian motion minus the parabola $y(t) = t^2$ reaches its maximum.
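The limit variable Z in the conjectured theorem has no elementary distribution, but it is easy to approximate by simulation. The sketch below draws approximate realizations of Z by maximizing a discretized two-sided Brownian motion minus $t^2$ over a grid on $[-c, c]$. The truncation level c = 3, the grid step 0.01 and the function names are assumptions for this illustration; the discretization slightly biases the location of the maximum.

```python
import math
import random


def sample_z(rng, c=3.0, step=0.01):
    """Approximate the location of the maximum of W(t) - t^2 on a grid over [-c, c],
    where W is two-sided Brownian motion started at zero."""
    n = int(round(c / step))
    sd = math.sqrt(step)
    best_t, best_val = 0.0, 0.0          # at t = 0 the process equals 0
    for sign in (1.0, -1.0):             # two independent branches from the origin
        w = 0.0
        for k in range(1, n + 1):
            w += rng.gauss(0.0, sd)      # Brownian increment over one grid step
            t = sign * k * step
            val = w - t * t
            if val > best_val:
                best_t, best_val = t, val
    return best_t


def sample_many(reps, seed=0):
    """Draw `reps` approximate realizations of Z."""
    rng = random.Random(seed)
    return [sample_z(rng) for _ in range(reps)]
```

By symmetry of the process the simulated values should be centered around zero, and a histogram of `sample_many(...)` gives a rough picture of the limit distribution.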

Specializing g to the Uniform distribution on [0, 1], the discontinuity points are $a_0 = 0$ and $a_1 = 1$, and
we get for a fixed interior point $t_0 \in (0, 1)$:

$$n^{1/3}\{\hat F_n(t_0) - F_0(t_0)\}\big/\bigl\{\tfrac12 F_0(t_0)\{1 - F_0(t_0)\}f_0(t_0)\bigr\}^{1/3} \xrightarrow{\mathcal{D}} 2Z, \tag{6.1}$$

where Z is as in the conjectured Theorem 6.1. We know this to be true by the interpretation in terms of the
current status model, see (5.2) in section 5.

Specializing g to the standard exponential distribution on $[0, \infty)$ gives only one discontinuity point $a_0 = 0$,
and the conjectured theorem then yields, for $t_0 > 0$:

$$n^{1/3}\{\hat F_n(t_0) - F_0(t_0)\}\big/\bigl\{\tfrac12 f_0(t_0)h(t_0)\bigr\}^{1/3} \xrightarrow{\mathcal{D}} 2Z. \tag{6.2}$$

This is also proved to be true in [21].

For the more general case of a decreasing density, the conjectured theorem still has not been proved. The 
problem is discussed in some detail in Chapter 5 of [5]. Possibly the assumptions are somewhat too strong. 
In Chapter 5 of [5] the following conditions are used. 

(i) The distribution function $F_0$ is continuous on $[0, S_0]$, and $F_0(S_0) = 1$, where $S_0 < \infty$.

(ii) In a neighborhood of $t_0 \in (0, S_0)$, $F_0$ is continuously differentiable, with derivative $f_0$, satisfying
$f_0(t_0) > 0$.

(iii) The density g is bounded, decreasing and continuous on $[0, \infty)$ and has compact support $[0, S_g]$.
Moreover, g has a bounded Lipschitz continuous derivative on $(0, S_g)$.

Under these assumptions, proving the conjectured Theorem 6.1 again depends on being able to deal with
certain integral equations. Just as in the case of interval censoring, Theorem 6.1 will follow if we can prove
that a certain remainder term is a smooth functional, which converges to zero at a faster rate than the cube
root n rate. The analysis starts by considering the estimate $h_n$ of the density $h_0$, generating the observations
$Z_i = X_i + Y_i$, defined by

$$h_n(t) = \int g(t - x)\,d\hat F_n(x),$$

where $\hat F_n$ is the MLE. We write this in the form

$$h_n(t) = g(0)\hat F_n(t) + \int_{x=0}^{t}\{g(t - x) - g(0)\}\,d\hat F_n(x). \tag{6.3}$$

Note that the second term on the right-hand side of (6.3) has a continuous (but not differentiable) integrand
at t, in contrast with the preceding representation of $h_n$. We expect

$$\int_{x=0}^{t}\{g(t - x) - g(0)\}\,d\hat F_n(x) - \int_{x=0}^{t}\{g(t - x) - g(0)\}\,dF_0(x) = O_p\bigl(n^{-1/2}\bigr) \tag{6.4}$$

to be a smooth functional, which would mean that the dominant asymptotic behavior of $h_n(t)$ is given by

$$g(0)\hat F_n(t) + \int_{x=0}^{t}\{g(t - x) - g(0)\}\,dF_0(x). \tag{6.5}$$
This, in turn, would imply that the MLE would be asymptotically equivalent with a certain "toy estimator",
obtained by doing one step of the iterative convex minorant algorithm (for a discussion of this algorithm, see
[22]), and then the conjectured theorem would hold, as further explained in [11] and [20]. This would give
for the present model:

$$n^{1/3}\Bigl(\frac{2g(0)^2}{f_0(t_0)h_0(t_0)}\Bigr)^{1/3}\{\hat F_n(t_0) - F_0(t_0)\} \xrightarrow{\mathcal{D}} 2Z, \tag{6.6}$$

where Z is as in the conjectured Theorem 6.1.

To prove (6.4), we this time have to deal with the integral equation (5.7) in $\phi$:

$$\kappa_{t,F_0}(x) = \int_{z>x}\frac{\int g(z - u)\,d\phi(u)}{h_0(z)}\,g(z - x)\,dz,$$

where in this case:

$$\kappa_{t,F_0}(x) = \{g(t - x) - g(0)\}1_{[0,t)}(x) - \int_{u=0}^{t}\{g(t - u) - g(0)\}\,dF_0(u). \tag{6.7}$$

Assuming $S_0 = S_g = 1$ in the conditions above, we get that differentiation leads again to a Fredholm integral
equation of the second kind:

$$\phi_{t,F_0}(x) + \int_0^1 A(x,u)\,\phi_{t,F_0}(u)\,du = \frac{g'(t - x)\,h_0(x)\,1_{[0,t)}(x)}{g(0)^2}, \qquad x \in (0,1), \tag{6.8}$$

where the kernel A is given by (5.10). But note that this time the expression on the right-hand side has a jump
discontinuity at t. This also leads to a solution with a jump discontinuity at the same location. For the example
where $F_0$ is the uniform distribution function on [0, 1] and g the "elbow density" $g(x) = 2(1 - x)1_{[0,1]}(x)$ the
solution is given in Figure 10, taking t = 0.5. Note the jump at t = 0.5.
The corresponding efficient influence function $\theta_{t,F_0}$, defined by

$$\theta_{t,F_0}(z) = \frac{\int g(z - u)\,d\phi_{t,F_0}(u)}{h_0(z)}, \tag{6.9}$$

is given in Figure 11.

We can now proceed in a similar way as in the preceding section and define functions $\theta_{t,\hat F_n}$ and $\bar\theta_{t,\hat F_n}$ as in
(5.16) and (5.20). The result is shown in Figure 12. Note the analogy with what happened in section 3, where 
the proof of the convergence to the asymptotic distribution also was based on a showing that a remainder 
term was of order Op(n~^/^) and asymptotically normal by showing that an integral equation had a solution 
(j)^ p which had a jump discontinuity at a point t where we wanted to determine the local limit of the MLE, 
see (3.2). The difference with the present situation is that in that case a proof is available (in [12]), whereas 
for the deconvolution case the proof is still not completed. 
Fig 10. The function $\phi_{t,F_0}$, solving the integral equation (6.8) for t = 0.5, where $F_0$ is the uniform distribution function on
[0, 1] and g the elbow density $g(x) = 2(1 - x)1_{[0,1]}(x)$.

Fig 11. The function $\theta_{t,F_0}$, defined by (6.9), for t = 0.5, where $F_0$ is the uniform distribution function on [0, 1] and g the
elbow density $g(x) = 2(1 - x)1_{[0,1]}(x)$.

It should be mentioned that the integral equation, determining $\phi_{t,\hat F_n}$ and $\theta_{t,\hat F_n}$, can be explicitly solved for

Fig 12. The function $\theta_{t,\hat F_n}\{\sqrt{h_n} + \sqrt{h_0}\}/2$ (solid), defined by (5.16), and the function $\bar\theta_{t,\hat F_n}\{\sqrt{h_n} + \sqrt{h_0}\}/2$ (dashed), defined
by (5.20), for the same model as used in Figure 11, and for the same sample as used in Figure 4; $\tau_m = 1.08617$.



exponential deconvolution. Note that we leave the compact support case here. Defining 

\[
K_t(F) = \int \{g(t - x) - g(0)\}\,dF(x),
\]

we get: 



\[
\phi_{t,\hat F_n}(x) = -K_t(\hat F_n)\{\hat F_n(x) - 1\}, \qquad x \in [t, \tau_m],
\]



where $\tau_1$ and $\tau_m$ are the first and last points of jump of $\hat F_n$, respectively. Here we have the unusual situation
that $\phi_{t,\hat F_n}$ is almost absolutely continuous w.r.t. $\hat F_n$, which is only spoilt by the jump of $\phi_{t,\hat F_n}$ at t. Moreover,
we get:

\[
\theta_{t,\hat F_n}(z) = -1_{[0,t)}(z) - K_t(\hat F_n).
\]
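As a numerical sanity check of these formulas, the following Python sketch replaces the MLE by a small hypothetical discrete distribution (the arrays `tau` and `p` are made up for illustration) and verifies that the signed measure $d\phi$ reproduces $\theta$ through the ratio in (6.9), with $F_0$ replaced by the discrete distribution. Two points are assumptions of this sketch rather than statements from the text: the integral defining $K_t$ is taken over $[0, t]$, and on $[\tau_1, t)$ the function $\phi$ is taken to have density $-\{1 + K_t\}$ with respect to the discrete distribution, a branch reconstructed here so that it matches the displayed branch on $[t, \tau_m]$.

```python
import numpy as np

def g(x):
    """Standard exponential density: exp(-x) for x >= 0, zero otherwise."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, np.exp(-np.clip(x, 0.0, None)), 0.0)

# Hypothetical discrete distribution standing in for the MLE.
tau = np.array([0.2, 0.4, 0.7, 0.9, 1.3])       # points of jump tau_1 < ... < tau_m
p = np.array([0.15, 0.25, 0.20, 0.30, 0.10])    # masses, summing to 1
t = 0.5                                         # chosen so that t is not a point of jump

def F(x):
    """Distribution function of the discrete stand-in."""
    return p[tau <= x].sum()

# K_t(F), with the integral taken over [0, t] (assumption of this sketch).
left = tau < t
K = np.sum(p[left] * (g(t - tau[left]) - 1.0))

# Signed measure d(phi): density -(1 + K) w.r.t. F before t (reconstructed branch),
# density -K w.r.t. F after t, and one extra jump at t.
jump_at_t = np.sum(p[left] * g(t - tau[left]))  # equals K + F(t)
atoms = np.append(tau, t)
dphi = np.append(np.where(left, -(1.0 + K) * p, -K * p), jump_at_t)

def phi(x):
    """phi(x) as the cumulative mass of d(phi) on [0, x]."""
    return dphi[atoms <= x].sum()

def theta_from_phi(z):
    """Ratio in (6.9), with F_0 replaced by the discrete stand-in."""
    return np.sum(g(z - atoms) * dphi) / np.sum(g(z - tau) * p)

z_grid = np.array([0.25, 0.45, 0.5, 0.6, 0.8, 1.0, 2.0])
theta_num = np.array([theta_from_phi(z) for z in z_grid])
theta_closed = -(z_grid < t).astype(float) - K  # closed form for theta
```

Under these assumptions `theta_num` agrees with `theta_closed` on the grid, `phi(x)` equals `-K * (F(x) - 1)` for `x` in $[t, \tau_m]$, and `phi` vanishes at the last point of jump.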

We have the following result. 

Theorem 6.2. ($K_t(\hat F_n)$ is a smooth functional in exponential deconvolution.) Let $X_1, \dots, X_n$ be a sample
from a continuous distribution function $F_0$, concentrated on $[0, \infty)$. Furthermore, let $Z_1, \dots, Z_n$ be a sample of
observations of the type $Z_i = X_i + Y_i$, where $Y_1, \dots, Y_n$ are independent of the $X_i$ and standard exponentially
distributed. Suppose that $\hat F_n$ is the MLE of $F_0$ on the basis of the sample $Z_1, \dots, Z_n$. Then, for each point t
in the interior of the support of the distribution of the $X_i$:

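\[
\sqrt{n}\,\{K_t(\hat F_n) - K_t(F_0)\} = \sqrt{n} \int \{e^{-(t - x)} - 1\}\,d(\hat F_n - F_0)(x) \stackrel{\mathcal D}{\longrightarrow} N(0, \sigma_t^2), \tag{6.10}
\]

where $N(0, \sigma_t^2)$ denotes a normal distribution with first moment zero and variance $\sigma_t^2$, given by:

\[
\sigma_t^2 = \int \theta_{t,F_0}(z)^2\,dH_0(z),
\]

and where (the score function) $\theta_{t,F_0}$ is given by:

\[
\theta_{t,F_0}(z) = -1_{[0,t)}(z) - K_t(F_0), \qquad z > 0. \tag{6.11}
\]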
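The quantities in the theorem can be illustrated by simulation for one assumed example: $F_0$ uniform on [0, 1] and t = 0.5 (both chosen here for illustration only). The sketch reads $H_0$ as the distribution function of the $Z_i$ and takes the integral defining $K_t$ over $[0, t]$; both are our assumptions. Under them, $K_t(F_0) = 1 - e^{-t} - t = -H_0(t)$ for this $F_0$, the score function (6.11) is a centered indicator, and $\sigma_t^2 = H_0(t)\{1 - H_0(t)\}$. The code only checks these identities by Monte Carlo; it does not compute the MLE.

```python
import numpy as np

rng = np.random.default_rng(0)
t = 0.5
n = 200_000

# Assumed example: X uniform on [0, 1], Y standard exponential, Z = X + Y.
x = rng.uniform(0.0, 1.0, size=n)
z = x + rng.exponential(1.0, size=n)

# Closed form for this F_0 and 0 < t < 1 (integral over [0, t], an assumption):
# K_t(F_0) = int_0^t {exp(-(t - x)) - 1} dx = 1 - exp(-t) - t.
K0 = 1.0 - np.exp(-t) - t
H0_t = -K0                       # H_0(t) = P(Z <= t) = F_0(t) - h_0(t)

# Score function (6.11) evaluated at the simulated observations.
theta = -(z < t).astype(float) - K0

mc_mean = theta.mean()           # Monte Carlo estimate of E theta(Z), near 0
mc_var = theta.var()             # Monte Carlo estimate of sigma_t^2
sigma2 = H0_t * (1.0 - H0_t)     # closed form implied by the centered indicator
H_emp = np.mean(z <= t)          # empirical check of H_0(t)
```

For these settings `mc_mean` is close to zero and `mc_var` is close to `sigma2`, which is consistent with reading $\theta_{t,F_0}$ as a centered influence function.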

Using this result, we get a new proof of the local limit result (6.2) above, since (6.4) holds, and we get
that the MLE is asymptotically equivalent to the "toy estimator", obtained by doing one step of the iterative
convex minorant algorithm, starting the iterations with the underlying distribution function $F_0$.

7. Concluding remarks. 

It has been shown that further development of the local limit theory of the MLE for interval censoring and deconvolution
crucially depends on getting a grip on the associated integral equations. The same holds for the MSLE
for interval censoring, which converges locally at a faster rate than the MLE, under appropriate smoothing
conditions. Similar results will probably follow for the MSLE for deconvolution, but this has not been discussed
above.

It is also shown that the MLE can be expected to be asymptotically efficient in the estimation of smooth 
functionals, a property it attains automatically, without any smoothing. For deconvolution this property will 
usually not hold automatically for the Fourier type estimators which are commonly applied in this situation, 
using kernel estimators in the Fourier domain. The theory also produces answers to the often posed question:
"Why maximum likelihood?". One answer is: it automatically produces efficient estimates, in contrast with
other methods. Nowadays, these estimates can also be computed easily, for example using the support
reduction algorithm of [15] or some hybrid form of the EM algorithm, combined with the iterative convex
minorant algorithm, as was used for computing the MSLE in [13].
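To give a rough idea of what such a computation can look like, here is a minimal Python sketch of a plain EM iteration for exponential deconvolution. It is neither the support reduction algorithm of [15] nor the hybrid EM and iterative convex minorant scheme used in [13]: it simply restricts the candidate distribution to a fixed grid of support points and updates the masses by the standard EM step for mixture proportions. The function name `em_deconvolution`, the grid, the sample size and the number of iterations are all hypothetical choices made for illustration.

```python
import numpy as np

def em_deconvolution(z, grid, n_iter=200):
    """Plain EM for the masses of a discrete F on `grid`, given Z = X + Y with Y ~ Exp(1)."""
    z = np.asarray(z, dtype=float)
    grid = np.asarray(grid, dtype=float)
    diff = z[:, None] - grid[None, :]
    # G[i, j] = g(z_i - x_j) for the standard exponential density g.
    G = np.where(diff >= 0, np.exp(-np.clip(diff, 0.0, None)), 0.0)
    p = np.full(grid.size, 1.0 / grid.size)      # start from uniform masses
    loglik = []
    for _ in range(n_iter):
        h = G @ p                                # mixture density at each observation
        loglik.append(np.log(h).mean())          # log likelihood before the update
        p = p * (G / h[:, None]).mean(axis=0)    # EM update; masses keep summing to 1
    return p, np.array(loglik)

# Simulated data from an assumed model: X uniform on [0, 1], Y standard exponential.
rng = np.random.default_rng(1)
n = 500
z = rng.uniform(0.0, 1.0, size=n) + rng.exponential(1.0, size=n)

# Hypothetical grid; it contains 0 so that every observation has positive density.
grid = np.linspace(0.0, 1.5, 31)
p_hat, loglik = em_deconvolution(z, grid)
```

The log likelihood in `loglik` is nondecreasing over the iterations, as is always the case for EM, but convergence is typically slow, which is one reason why faster schemes such as those cited above are used in practice.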